SCARECROW GENE, PROMOTER AND USES THEREOF

This application is a continuation-in-part of
co-pending Application No. 08/638,617, filed April 26, 1996,
the disclosure of which is incorporated by reference in its
entirety.

This invention was made with government support
under grant number GM43778 awarded by the National
Institutes of Health. The government may have certain rights
in the invention.

1. INTRODUCTION
The present invention generally relates to the
SCARECROW (SCR) gene family and their promoters. The
invention more particularly relates to ectopic expression of
members of the SCARECROW gene family in transgenic plants to
artificially modify plant structures. The invention also
relates to utilization of the SCARECROW promoter for tissue-
and organ-specific expression of heterologous gene products.

2. BACKGROUND OF THE INVENTION

Asymmetric cell divisions, in which a cell divides
to give two daughters with different fates, play an important
role in the development of all multicellular organisms. In
plants, because there is no cell migration, the regulation of
asymmetric cell divisions is of heightened importance in
determining organ morphology. In contrast to animal
embryogenesis, most plant organs are not formed during
embryogenesis. Rather, cells that form the apical meristems
are set aside at the shoot and root poles. These reservoirs
of stem cells are considered to be the source of all
post-embryonic organ development in plants. A fundamental
question in developmental biology is how meristems function
to generate plant organs.

2.1. ROOT DEVELOPMENT

Root organization is established during
embryogenesis. This organization is propagated during
postembryonic development by the root meristem. Following
germination, the development of the postembryonic root is a
continuous process: a series of initials or stem cells
continuously divide to perpetuate the pattern established in
the embryonic root (Steeves & Sussex, 1972, Patterns in Plant
Development, Englewood Cliffs, NJ: Prentice-Hall, Inc.).

Due to the organization of the Arabidopsis root, it
is possible to follow the fate of cells from the meristem to
maturity and identify the progenitors of each cell type
(Dolan et al., 1993, Development 119:71-84). The Arabidopsis
root is a relatively simple and well characterized organ.
The radial organization of the mature tissues in the
Arabidopsis root has been likened to tree rings with the
epidermis, cortex, endodermis and pericycle forming radially
symmetric cell layers that surround the vascular cylinder
(FIG. 1A). See also Dolan et al., 1993, Development
119:71-84. These mature tissues are derived from four sets
of stem cells or initials: i) the columella root cap initial;
ii) the pericycle/vascular initial; iii) the
epidermal/lateral root cap initial; and iv) the
cortex/endodermal initial (Dolan et al., 1993, Development
119:71-84). It has been shown that these initials undergo
asymmetric divisions (Scheres et al., 1995, Development
121:53-62). The cortex/endodermal initial, for example,
first divides anticlinally (in a transverse orientation)
(FIG. 1B). This asymmetric division produces another initial
and a daughter cell. The daughter cell, in turn, expands and
then divides periclinally (in the longitudinal orientation)
(FIG. 1B). This second asymmetric division produces the
progenitors of the endodermis and the cortex cell lineages
(FIG. 1B).

2.2. GENES REGULATING ROOT STRUCTURE
Mutations that disrupt the asymmetric divisions of
the cortex/endodermal initial have been identified and
characterized (Benfey et al., 1993, Development 119:57-70;
Scheres et al., 1995, Development 121:53-62). The short-root
(shr) and scarecrow (scr) mutants are missing a cell layer
between the epidermis and the pericycle. In both types of
mutants the cortex/endodermal initial divides anticlinally,
but the subsequent periclinal division that increases the
number of cell layers does not take place (Benfey et al.,
1993, Development 119:57-70; Scheres et al., 1995,
Development 121:53-62). The defect is first apparent in the
embryo and it extends throughout the entire embryonic axis,
which includes the embryonic root and hypocotyl (Scheres et
al., 1995, Development 121:53-62). This is also true for the
other radial organization mutants characterized to date,
suggesting that radial patterning that occurs during
embryonic development may influence the post-embryonic
pattern generated by the meristematic initials (Scheres et
al., 1995, Development 121:53-62).
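
As an illustrative aside, the two-step division pattern of the cortex/endodermal initial and its disruption in shr and scr mutants can be sketched as a toy model. This is only a sketch of the logic described above, not a biological simulation; the names Cell, anticlinal_division, periclinal_division and radial_pattern, the flag periclinal_ok, and the label "mutant_layer" are assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """A cell labelled by its identity (labels are illustrative assumptions)."""
    kind: str


def anticlinal_division(initial: Cell) -> tuple:
    """First asymmetric division: regenerates the initial and yields a daughter."""
    if initial.kind != "initial":
        raise ValueError("expected the cortex/endodermal initial")
    return Cell("initial"), Cell("daughter")


def periclinal_division(daughter: Cell, periclinal_ok: bool) -> list:
    """Second asymmetric division, listed from the outer layer inward.

    When periclinal_ok is False (an shr- or scr-like mutant) the division
    does not take place and a single cell layer remains.
    """
    if daughter.kind != "daughter":
        raise ValueError("expected the daughter of the initial")
    if periclinal_ok:
        return [Cell("cortex"), Cell("endodermis")]
    return [Cell("mutant_layer")]


def radial_pattern(periclinal_ok: bool) -> list:
    """Cell layers from the epidermis to the pericycle, outermost first."""
    _initial, daughter = anticlinal_division(Cell("initial"))
    ground = periclinal_division(daughter, periclinal_ok)
    return ["epidermis", *[cell.kind for cell in ground], "pericycle"]


if __name__ == "__main__":
    print(radial_pattern(True))   # wild-type-like pattern
    print(radial_pattern(False))  # shr- or scr-like pattern
```

In this sketch the wild-type case yields two layers between the epidermis and the pericycle, whereas the mutant case yields one, mirroring the missing cell layer reported for shr and scr.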

Characterization of the mutant cell layer in shr
indicated that two endodermal-specific markers were absent
(Benfey et al., 1993, Development 119:57-70). This provided
evidence that the wild-type SHR gene may be involved in
specification of endodermis identity.

2.3. GEOTROPISM

In plants, the capacity for gravitropism has been
correlated with the presence of amyloplast sedimentation.
See, e.g., Volkmann and Sievers, 1979, Encyclopedia Plant
Physiol., N.S. vol. 7, pp. 573-600; Sack, 1991, Intern. Rev.
Cytol. 127:193-252; Björkman, 1992, Adv. Space Res.
12:195-201; Poff et al., in The Physiology of Tropisms,
Meyerowitz & Somerville (eds.); Cold Spring Harbor Laboratory
Press, Plainview, NY (1994) pp. 639-664; Barlow, 1995, Plant
Cell Environ. 18:951-962. Amyloplast sedimentation only occurs
in cells in specific locations at distinct developmental stages.
That is, when and where sedimentation occurs is precisely
regulated (Sack, 1991, Intern. Rev. Cytol. 127:193-252). In
roots, amyloplast sedimentation only occurs in the central
(columella) cells of the rootcap; as these cells mature into
peripheral cap cells, the amyloplasts no longer sediment
(Sack & Kiss, 1989, Amer. J. Bot. 76:454-464; Sievers &
Braun, in The Root Cap: Structure and Function, Waisel et
al. (eds.), New York: M. Dekker (1996) pp. 31-49). In stems
of many plants, including Arabidopsis, amyloplast
sedimentation occurs in the starch sheath (endodermis),
especially in elongating regions of the stem (von Guttenberg,
Die Physiologischen Scheiden, Handbuch der Pflanzenanatomie;
K. Linsbauer (ed.), Berlin: Gebrüder Borntraeger, vol. 5
(1943) p. 217; Sack, 1987, Can. J. Bot. 65:1514-1519; Sack,
1991, Intern. Rev. Cytol. 127:193-252; Caspar & Pickard,
1989, Planta 177:185-197; Volkmann et al., 1993, J. Pl.
Physiol. 142:710-6).

Gravitropic mutants have been studied for evidence
that proves the role of amyloplast sedimentation in gravity
sensing. However, many gravitropic mutations affect
downstream events such as auxin sensitivity or metabolism
(Masson, 1995, BioEssays 17:119-127). Other mutations seem
to affect gene products that process information from gravity
sensing. For example, the lazy mutants of higher plants and
comparable mutants in mosses can clearly sense and respond to
gravity, but the mutations reverse the normal polarity of the
gravitropic response (Gaiser & Lomax, 1993, Plant Physiol.
102:339-344; Jenkins et al., 1986, Plant Cell Environ.
9:637-644). Other mutations appear to affect gravitropism of
specific organs. For example, sgr mutants have defective
shoot gravitropism (Fukaki et al., 1996, Plant Physiol.
110:933-943; Fukaki et al., 1996, Plant Physiol. 110:945-955;
Fukaki et al., 1996, Plant Res. 109:129-137).

Citation or identification of any reference herein
shall not be construed as an admission that such reference is
available as prior art to the present invention.

3. SUMMARY OF THE INVENTION
The structure and function of a regulatory gene,
SCARECROW (SCR), is described. The SCR gene is expressed
specifically in root progenitor tissues of embryos, and in
certain tissues of roots and stems. SCR expression controls
cell division of certain cell types in roots, and affects the
organization of root and stem. The invention relates to the
SCARECROW (SCR) gene (which encompasses the Arabidopsis SCR
gene and its orthologs and paralogs), SCR gene products
(including but not limited to transcriptional products such
as mRNAs, antisense and ribozyme molecules, and translational
products such as the SCR protein, polypeptides, peptides and
fusion proteins related thereto), antibodies to SCR gene
products, SCR regulatory regions and the use of the foregoing
to improve agronomically valuable plants.

The invention is based, in part, on the discovery,
identification and cloning of the gene responsible for the
scarecrow phenotype. In contrast to the prevailing view that
the SCR gene was likely to be involved in the specification
of endodermis, the inventors have determined that the mutant
cell layer in roots of scr mutants has differentiated
characteristics of both cortex and endodermis. This is
consistent with a role for SCR in the regulation of the
asymmetric cell division rather than in specification of the
identity of either cortex or endodermis. The inventors have
also determined that SCR expression affects the gravitropism
of plant aerial structures such as the stem.

One aspect of the invention relates to the
heterologous expression of SCR genes and related nucleotide
sequences, and specifically the Arabidopsis SCR genes, in
stably transformed higher plant species. Modulation of SCR
expression levels can be used to advantageously modify root
and aerial structures of transgenic plants and enhance the
agronomic properties of such plants.
Another aspect of the invention relates to the use
of promoters of SCR genes, and specifically the use of the
Arabidopsis SCR promoter, to control the expression of protein
and RNA products in plants. Plant SCR promoters have a
variety of uses, including but not limited to expressing
heterologous genes in the embryo, root, root nodule, and stem
of transformed plants.

The invention is illustrated by working examples
described infra which demonstrate the isolation of the
Arabidopsis SCR gene using insertion mutagenesis. More
specifically, T-DNA tagging of genomic and cDNA clones of the
Arabidopsis SCR gene is described. Additional working
examples include the isolation of SCR sequences from plant
genomes using PCR amplification in combination with screening
of genomic libraries, and heterologous gene expression in
transgenic plants using SCR promoter expression constructs.

Structural analysis of the deduced amino acid
sequence of Arabidopsis SCR protein indicates that SCR
encodes a transcription factor. Northern analysis, in situ
hybridization analysis and enhancer trap analysis show highly
localized expression of Arabidopsis SCR in embryos and roots.
Genetic analysis shows SCR expression also affects
gravitropism of aerial structures (e.g., stems). This
indicates that SCR is also expressed in those structures.

Computer comparison of the deduced amino acid
sequence of Arabidopsis SCR protein with Expressed
Sequence Tag (EST) sequences in GenBank reveals the existence
of at least thirteen SCR genes in Arabidopsis, one SCR gene
in maize, four SCR genes in rice, and one SCR gene in
Brassica. A further aspect of the invention relates to the
use of such EST sequences to obtain larger and/or complete
clones of the corresponding SCR gene.

The various embodiments of the claimed invention
presented herein are by way of illustration and are not
meant to limit the invention.



3.1. DEFINITIONS
As used herein, the terms listed below will have
the meanings indicated.

35S                    = cauliflower mosaic virus promoter for the 35S
                         transcript

cDNA                   = complementary DNA

cis-regulatory element = a promoter sequence 5' upstream of the TATA
                         box that confers specific regulatory response
                         to a promoter containing such an element. A
                         promoter may contain one or more
                         cis-regulatory elements, each responsible for
                         a particular regulatory response

coding sequence        = sequence that encodes a complete or partial
                         gene product (e.g., a complete protein or a
                         fragment thereof)

DNA                    = deoxyribonucleic acid

EST                    = expressed sequence tag

functional portion     = a functional portion of a promoter is any
                         portion of a promoter that is capable of
                         causing transcription of a linked gene
                         sequence, e.g., a truncated promoter

gene fusion            = a gene construct comprising a promoter
                         operably linked to a heterologous gene,
                         wherein said promoter controls the
                         transcription of the heterologous gene

gene product           = the RNA or protein encoded by a gene sequence

gene sequence          = sequence that encodes a complete gene product
                         (e.g., a complete protein)

GUS                    = 1,3-β-glucuronidase

gDNA                   = genomic DNA

heterologous gene      = in the context of gene constructs, a
                         heterologous gene means that the gene is
                         linked to a promoter that said gene is not
                         naturally linked to. The heterologous gene
                         may or may not be from the organism
                         contributing said promoter. The heterologous
                         gene may encode messenger RNA (mRNA),
                         antisense RNA or ribozymes

homologous promoter    = a native promoter of a gene that selectively
                         hybridizes to the sequence of a SCR gene
                         described herein

mRNA                   = messenger RNA

operably linked        = a linkage between a promoter and gene sequence
                         such that the transcription of said gene
                         sequence is controlled by said promoter

ortholog               = related gene in a different plant (e.g.,
                         maize ZCARECROW gene is an ortholog of the
                         Arabidopsis SCR gene)

paralog                = related gene in the same plant (e.g.,
                         Arabidopsis SRPa1 is a paralog of Arabidopsis
                         SCR gene)

RNA                    = ribonucleic acid

RNase                  = ribonuclease

SCR (italic)           = SCARECROW gene or gene product, encompasses
                         SCR and ZCR genes and their orthologs and
                         paralogs

SCR                    = SCARECROW protein

scr (lower case)       = scarecrow mutant (e.g., scr1)

ZCR                    = maize ZCARECROW gene, a paralog of, for
                         example, the Arabidopsis SCR gene

SCR protein means a protein containing sequences or
a domain substantially similar to one or more motifs (i.e.,
Motifs I-VI), preferably Motif III (VHIID), of Arabidopsis SCR
protein as shown in FIGS. 13A-F and FIGS. 15A-S. SCR
proteins include SCR ortholog and paralog proteins having the
structure and activities described herein.

SCR polypeptides and peptides include deleted or
truncated forms of the SCR protein, and fragments
corresponding to the SCR motifs described herein.

SCR fusion proteins encompass proteins in which the
SCR protein or an SCR polypeptide or peptide is fused to a
heterologous protein, polypeptide or peptide.

SCR gene, nucleotides or coding sequences means
nucleotides, e.g., gDNA or cDNA encoding SCR protein, SCR
polypeptides or peptides, or SCR fusion proteins.

SCR gene products include transcriptional products
such as mRNAs, antisense and ribozyme molecules, as well as
translational products of the SCR nucleotides described
herein including but not limited to the SCR protein,
polypeptides, peptides and/or SCR fusion proteins.

SCR promoter means the regulatory region native to
the SCR gene in a variety of species, which promotes the
organ- and tissue-specific pattern of SCR expression described
herein.



4. BRIEF DESCRIPTION OF THE FIGURES

FIGS. 1A-B. Schematic of Arabidopsis root anatomy.
FIG. 1A. Transverse section showing the four tissues,
epidermis, cortex, endodermis and pericycle, that surround the
vascular tissue. In the longitudinal section, the
epidermal/lateral root cap initials and the cortex/endodermal
initials are shown at the base of their respective cell
files. FIG. 1B. Schematic of division pattern of the
cortex/endodermal initial. The initial expands then divides
anticlinally to reproduce itself and a daughter cell. The
daughter then divides periclinally to produce the progenitors
of the endodermis and cortex cell lineages. Abbreviations:
C, cortex; Da, daughter cell; E, endodermis; In, initial.

FIGS. 2A-F. Phenotype of scr mutant plants.
FIG. 2A. Shown left to right are 12-day scr-2, scr-1 and
wild-type seedlings grown vertically on nutrient agar medium.
FIG. 2B. 21-day scr-2 mutant plants in soil. FIG. 2C.
Transverse section through primary root of 7-day scr-2. FIG.
2D. Transverse section through primary root of 7-day
wild-type (WT). FIG. 2E. Transverse section through lateral
root of 12-day scr-1 mutant seedling. FIG. 2F. Transverse
section through root regenerated from scr-1 callus. Bar, 50
µm. Abbreviations: C, cortex; En, endodermis; Ep, epidermis;
M, mutant cell layer; P, pericycle; V, vascular tissue.






WO 97/41152 



PCT/US97/07022 



FIGS. 3A-F. Characterization of the cellular
identity of the mutant cell layer. FIG. 3A.
Endodermis-specific Casparian band staining of transverse
sections through the primary root of 7-day scr-2 mutant. (Note:
the histochemical stain also reveals xylem cells in the vascular
cylinder.) FIG. 3B. Casparian band staining of transverse
sections through the primary root of 7-day wild-type (WT).
FIG. 3C. Immunostaining with the endodermis (and a subset of
vascular tissue) specific JIM13 monoclonal antibodies on
transverse root sections of scr-2 mutant. FIG. 3D.
Immunostaining with JIM13 monoclonal antibodies on transverse
root sections of WT. FIG. 3E. Immunostaining with the JIM7
monoclonal antibody that stains all cell walls on transverse
root sections of scr-2 mutant. FIG. 3F. Immunostaining with
JIM7 monoclonal antibodies on transverse root sections of WT.
Bar, 25 µm. Abbreviations are same as those for description
of FIGS. 2A-2F and: Ca, Casparian strip.

FIGS. 4A-F. Immunostaining. FIG. 4A.
Immunostaining with the cortex (and epidermis) specific
CCRC-M2 monoclonal antibodies on transverse root sections of
scr-2 mutant. FIG. 4B. Immunostaining with CCRC-M2 antibodies on
transverse root sections of scr-2 mutant. FIG. 4C.
Immunostaining with CCRC-M2 antibodies on transverse root
sections of wild-type (WT). FIG. 4D. Immunostaining with
the CCRC-M1 monoclonal antibodies (specific to a cell wall
epitope found on all cells) on transverse root sections of
scr-1. FIG. 4E. Immunostaining with CCRC-M1 antibodies on
transverse root sections of scr-2. FIG. 4F. Immunostaining
with CCRC-M1 antibodies on transverse root sections of WT.
Bar, 30 µm. Abbreviations are same as those for description
of FIGS. 2A-2F.

FIGS. 5A-E. Structure of the Arabidopsis SCARECROW
gene. FIG. 5A. Nucleic acid sequence and deduced amino acid
sequence of the Arabidopsis SCR genomic region (SEQ ID NO:1)
and (SEQ ID NO:2), respectively. Regulatory sequences
including: (i) TATA box, (ii) ATG start codon, and (iii)
potential polyadenylation sequence are underlined. Within
the deduced amino acid sequence homopolymeric repeats are
underlined. FIG. 5B. Schematic diagram of genomic clone
indicating possible functional motifs, T-DNA insertion sites
and subclones used as probes. Abbreviations: Q, S, P, T, region
with homopolymeric repeats of these amino acids; b, region
with similarity to the basic region of bZIP factors; I and
II, regions with leucine heptad repeats; E, acidic region.
FIG. 5C. Comparison of the charged region found in
Arabidopsis SCR protein with that found in bZIP transcription
factors: SCR bZIP-like domain (SEQ ID NO:3), GCN4 (SEQ ID
NO:4), TGA1 (SEQ ID NO:5), c-Fos (SEQ ID NO:6), c-Jun (SEQ ID
NO:7), CREB (SEQ ID NO:8), Opaque-2 (SEQ ID NO:9), OBF2 (SEQ
ID NO:10), RAF-1 (SEQ ID NO:11). FIG. 5D. Translations of
EST clones encoding putative peptides having similarities to
the VHIID domain region of Arabidopsis SCR protein (SEQ ID
NO:12). F13896 (SEQ ID NO:13), Z37192 (SEQ ID NO:14), and
Z25645 (SEQ ID NO:15) are from Arabidopsis; T18310 (SEQ ID
NO:17) is from maize; and D41474 (SEQ ID NO:16) is from rice.
FIG. 5E. The deduced amino acid sequence of the Arabidopsis
SCARECROW gene (SEQ ID NO:2).

FIGS. 6A-B. Expression of the Arabidopsis
SCARECROW gene. FIG. 6A. Northern blot of total RNA from
wild-type siliques (Si), roots (R), leaves (L) and whole
seedlings (Sd) hybridized with Arabidopsis SCR probe "a" and
with a probe from the Arabidopsis glutamate dehydrogenase
(GDH) gene (Melo-Oliveira et al., 1996, Proc. Natl. Acad.
Sci. USA 93:4718-4723) as a control for RNA integrity. (GDH
expression is lower in siliques than in vegetative tissues.)
The 1.6 kb band corresponds to the GDH gene and the
approximately 2.5 kb band corresponds to SCR. Ribosomal RNA
is shown as a loading control. FIG. 6B. Northern blot of
Arabidopsis wild-type, scr-1 and scr-2 total RNA, probed with
Arabidopsis SCR probe "a", corresponding to a cDNA sequence
shown in FIG. 5B, and with the GDH probe. In the scr-2 mutant,
additional bands of 4.1 kb and 5.0 kb were detected.

FIGS. 7A-G. In situ hybridization and enhancer
trap analyses of Arabidopsis SCR expression. FIG. 7A. SCR
RNA expression detected by in situ hybridization of SCR
antisense probe to a longitudinal section through the root
meristem. FIG. 7B. In situ hybridization of SCR antisense
probe to a transverse section in the meristematic region.
FIG. 7C. In situ hybridization of SCR antisense probe to
late torpedo stage embryo. FIG. 7D. Negative control in
situ hybridization using a SCR sense probe to a longitudinal
section through the root meristem. FIG. 7E. GUS expression
in a whole mount in the enhancer trap line ET199, in primary
root tip. FIG. 7F. GUS expression in the ET199 line in
transverse root section in the meristematic region. FIG. 7G.
GUS expression in ET199 detected in a section through the
root meristem. GUS expression is observed in the
cortex/endodermal initial, and in the first cell in the
endodermal cell lineage but not in the first cell of the
cortex lineage. Expression in two endodermal layers is
observed higher up in the root because the section was not
median at that point. Bar, 50 µm. Abbreviations are same as
those in the description of FIGS. 2A-2F.
FIG. 8. Partial nucleotide sequence (SEQ ID NO:18)
and deduced amino acid sequence (SEQ ID NO:19) of the
Arabidopsis SRPa4 gene.

FIG. 9. Partial nucleotide sequence (SEQ ID NO:20)
and deduced amino acid sequence (SEQ ID NO:21) of the
Arabidopsis SRPa3 gene.

FIG. 10. Partial nucleotide sequence (SEQ ID
NO:22) of the Arabidopsis SRPa1 gene.

FIG. 11A. Nucleotide sequence (SEQ ID NO:24) and
deduced amino acid sequence (SEQ ID NO:25) of the maize
Zm-Scl1 fragment.

FIG. 11B. Partial nucleotide sequence (SEQ ID
NO:25) and deduced amino acid sequence (SEQ ID NO:26) of the
maize SRPm1 gene (Zm-Scl2).

FIGS. 12A-B. Nucleotide sequence of rice SRPo3 EST
clone. FIG. 12A. Sequence of 5' end of EST clone (SEQ ID
NO:28). FIG. 12B. Sequence of 3' end of EST clone (SEQ ID
NO:29).

FIGS. 13A-F. Comparison of the amino acid sequence
of members of the SCARECROW family of genes. Conserved
Motifs I through VI are indicated by dashed line above the
aligned sequences. Consensus sequences are shown in bold.
See Table 1 for the identity and sequence identifier number
of each of the sequences shown in this Figure. Hu-scr-1 =
Human SCR paralog (SEQ ID NO:40).

FIG. 14. Restriction map of the approximately 8.8
kb EcoRI insert DNA of lambda clone, t643, containing the
Arabidopsis SCR gene. The locations of the approximately 5.6
kb HindIII-SacI fragment subcloned in plasmid
pLIG1-3/SAC+MoB21SAC, and the SCR coding region are indicated
below the restriction map. The location of the translational
initiation site of the SCR gene is at the NcoI site at the
left end of the indicated coding region. The SCR coding
sequence begins at the translation initiation site and
extends approximately 1955 nucleotides to its right. E. coli
DH5α containing plasmid pLIG1-3/SAC+MoB21SAC has the ATCC
accession number 98031.
FIGS. 15A-S. Comparison of the partial and
complete amino acid sequences of several plant members of the
SCARECROW family of genes. The amino acid sequences are
aligned in a manner that maximizes amino acid sequence
similarity and identity among SCR family members. Each
sequence shown is continuous except where noted otherwise;
the dots are inserted between two sequence segments in order
to align homologous segments. "x" in the middle of a
sequence indicates ambiguity in the corresponding nucleotide
sequence and possible termination of the ORF at the "x"
residue site. "x" at the end of a sequence indicates
termination of the ORF at the "x" residue site. The
numbering of the amino acid residues is shown at the bottom
of each figure and is based on the Arabidopsis SCR amino acid
sequence. Conserved Motifs I through VI are indicated by the
various dashed lines above the figures. The new and old
names of the family members are shown in FIG. 15A. The
sequences of SCR, Tf1 and Tf4 are of the complete SCR
protein. See Table 1 for the identity and the sequence
identifier number of each sequence shown in these figures.

FIGS. 16A-M. The partial nucleotide sequences of
several plant members of the SCARECROW family of genes. "N"
indicates an unknown base. See Table 1 for the identity and
the sequence identifier number of each sequence shown in
these figures.

FIG. 17A. The partial nucleotide sequence (SEQ ID
NO:66) of the maize ZCR gene.

FIG. 17B. The partial amino acid sequence (SEQ ID
NO:67) of the maize ZCR gene. The underlined sequence shares
approximately 80% sequence identity with a corresponding
sequence of Arabidopsis SCR protein.

FIG. 18. Comparison of the partial amino acid
sequences of several SCR ortholog sequences amplified from
the genomes of carrot, soybean and spruce. The SRPd1 and
SRPp1 sequences each were obtained by PCR amplification using
a combination of 1F and 1R primers. The SRPg1 sequence was
obtained by PCR amplification using a combination of 1F and
WP primers. The amino acid sequences are aligned in a
manner that maximizes amino acid sequence identity and
similarity amongst these sequences. Each sequence shown is
continuous except where noted otherwise; the dashes are
inserted between two sequence segments in order to allow
alignment of homologous segments. "x" in the middle of a
sequence indicates ambiguity in the corresponding nucleotide
sequence and possible termination of the ORF or existence of
an intron at the "x" residue site. See Table 1 for the
identity and the sequence identifier number of each sequence
shown in this figure.

FIG. 19. Comparison of promoter activities in
transgenic lines and roots. Panel a. A stably transformed
line containing four copies of the B2 subdomain of the 35S
promoter of CaMV upstream of GUS (Benfey et al., 1990). GUS
is expressed in the root tip. Panel b. Roots emerging from
callus transformed with four copies of the B2 subdomain of
the 35S promoter fused to GUS. GUS expression can be seen in
the emerging root tips (arrows). Panel c. Higher
magnification of a root emerging from the callus in panel b.
GUS is clearly restricted to the root tip. The morphology of
roots regenerated from calli often appears abnormal. Panel
d. A transgenic plant regenerated from the calli and roots
shown in panel b. GUS expression in this plant appears to
be similar to that of the original line shown in panel a.
Panel e. ET199, a stably transformed line that contains an
enhancer trapping construct with a minimal promoter fused to
the GUS coding region inserted 1 kb upstream from the SCR
coding region. GUS expression is primarily in the endodermal
layer of the root. Panel f. Roots emerging from calli
transformed with the SCR promoter::GUS construct. Expression
of the GUS gene appears to be limited to an internal layer
(arrows). Panel g. SCR promoter::GUS transformed root in
liquid culture. Roots shown in panel f were excised and
transferred to liquid cultures. GUS expression is primarily
found in the endodermal layer as in ET199. The expression of
GUS in the quiescent center, as seen here, is also sometimes
observed in ET199. Bar, 50 µm.

FIG. 20. Analysis of SCR promoter activity in the
scr mutant background. Panel a. Roots emerging from scr
calli transformed with the SCR promoter::GUS construct.
Roots regenerated from scr calli are very short. GUS
expression appears to be limited to an internal layer of the
root (arrows). Panel b. Root regenerated from transformed
scr calli and transferred to liquid culture. The scr
phenotype, a single layer between the epidermis and
pericycle, is easily seen. GUS expression is limited to this
mutant layer. E, Epidermis. M, Mutant Layer. P, Pericycle.
Bar, 50 µm.

FIG. 21. Molecular complementation of the scr
mutant. Panels a, c and e. scr transformed with the SCR
promoter::GUS construct. Panels b, d and f. scr transformed
with the SCR promoter::SCR coding region construct. Panels a
and b. Roots emerging from scr calli. Arrows point to
several very short roots among many fine root hairs in the
scr calli transformed with the SCR promoter::GUS construct.
In contrast, roots from scr calli transformed with the SCR
promoter::SCR coding region construct appeared to be
wild-type in length, suggesting molecular complementation by
the transgene. Panels c and d. Transgenic roots in liquid
culture. The scr roots transformed with the SCR
promoter::GUS construct appeared short, while those
transformed with the SCR promoter::SCR coding region
construct appeared of wild-type length. Panels e and f.
Transverse sections through roots emerging from calli.
Whereas there is only a single cell layer between the
epidermis and stele in the SCR promoter::GUS transformed
root, the radial organization of the root transformed with
the SCR promoter::SCR coding region appeared identical to
wild-type, with both cortex and endodermal layers. E,
Epidermis. M, Mutant Layer. C, Cortex. En, Endodermis. P,
Pericycle. Bar, 50 µm.

FIG. 22. Expression of ZCR in maize root tips.
Left Panel. Expression of ZCR is in the endodermal layer and
extends down through the region of the quiescent center.
Right Panel. Higher magnification showing expression in a
single cell layer through the quiescent center.



5. DETAILED DESCRIPTION OF THE INVENTION

The invention relates to the SCARECROW (SCR) gene;
SCR gene products, including but not limited to
transcriptional products such as mRNAs, antisense and
ribozyme molecules, and translational products such as the
SCR protein, polypeptides, peptides and fusion proteins
related thereto; antibodies to SCR gene products; SCR
regulatory regions; and the use of the foregoing to improve
agronomically valuable plants.

In summary, the data described herein show the
identification of SCR, a gene involved in the regulation of a
specific asymmetric division, in controlling gravitropic
response in aerial structures, and in controlling pattern
formation in roots. Sequence analysis shows that the SCR
protein has many hallmarks of transcription factors. In situ
and marker line expression studies show that SCR is expressed
in the cortex/endodermal initial of roots before asymmetric
division occurs, and in the quiescent center of regenerating
roots. Together, these findings indicate that the SCR gene
regulates key events that establish the asymmetric division
that generates separate cortex and endodermal cell lineages,
and that affect tissue organization of roots. The
establishment of these lineages is not required for cell
differentiation to occur, because in the absence of division
the resulting cell acquires mature characteristics of both
cortex and endodermal cells. However, it is possible that
SCR functions to establish the polarity of the initial before
cell division, or that it is involved in generating an
external polarity that has an effect on asymmetric cell
division.

Genetic analysis indicates that SCR expression
affects gravitropism of plant stems and hypocotyls. This
indicates that SCR is also expressed in these aerial
structures of plants.

The SCR genes and promoters of the present
invention have a number of important agricultural uses. The
SCR promoters of the invention may be used in expression
constructs to express desired heterologous gene products in
the embryo, root, root nodule, and starch sheath layer in the
stem of transgenic plants transformed with such constructs.
For example, SCR promoters may be used to express disease
resistance genes such as lysozymes, cecropins, magainins, or
thionins for anti-bacterial protection or the pathogenesis-
related (PR) proteins such as glucanases and chitinases for
anti-fungal protection. SCR promoters also may be used to
express a variety of pest resistance genes in the
aforementioned plant structures and tissues. Examples of
useful gene products for controlling nematodes or insects
include Bacillus thuringiensis endotoxins, protease
inhibitors, collagenases, chitinase, glucanases, lectins, and
glycosidases.

Gene constructs that express or ectopically express
SCR, and the SCR-suppression constructs of the invention may
be used to alter the root and/or stem structure, and the
gravitropism of aerial structures of transgenic plants.
Since SCR regulates root cell divisions, overexpression of
SCR can be used to increase division of certain cells in
roots and thereby form thicker and stronger roots. Thicker
and stronger roots are beneficial in preventing plant
lodging. Conversely, suppression of SCR expression can be
used to decrease cell division in roots and thereby form
thinner roots. Thinner roots are more efficient in uptake of
soil nutrients. Since SCR affects gravitropism of aerial
structures, overexpression of SCR may be used to develop
"straighter" transgenic plants that are less susceptible to
lodging.

Further, SCR gene sequence may be used as a
molecular marker for a qualitative trait, e.g., a root or
gravitropism trait, in molecular breeding of crop plants.

For purposes of clarity and not by way of
limitation, the invention is described in the subsections
below in terms of (a) SCR genes and nucleotides; (b) SCR gene
products; (c) antibodies to SCR gene products; (d) SCR
promoters and promoter elements; (e) transgenic plants which
ectopically express SCR; (f) transgenic plants in which
endogenous SCR expression is suppressed; and (g) transgenic
plants in which expression of a transgene of interest is
controlled by SCR promoter.

5.1. SCR GENES 

The SCARECROW genes and nucleotide sequences of the
invention include: (a) a gene listed below in Table 1
(hereinafter, a gene comprising any one of the nucleotide
sequences shown in FIG. 5A, FIG. 8, FIG. 9, FIG. 10, FIGS.
11A-B, FIGS. 12A-B, FIGS. 16A-M, or FIG. 17A, or a segment of
such nucleotide sequences), or as contained in the clones
described herein and deposited with the ATCC (see Section 13,
infra); (b) nucleotide sequence that encodes a protein
comprising any one of the amino acid sequences shown in FIG.
5A, FIG. 5D, FIG. 5E, FIG. 8, FIG. 9, FIGS. 11A-B, FIGS.
13A-F, FIGS. 15A-S, FIG. 17B or FIG. 18 or a segment of such
amino acid sequences, or that is encoded by any one of the
genes and/or nucleotide sequences listed by their sequence
identifier numbers in Table 1, or any segment of such genes
and/or nucleotide sequences, or contained in any one of the
clones described herein and deposited with the ATCC (see
Section 13, infra); (c) any gene comprising nucleotide
sequence that hybridizes to the complement of any one of the
genes and/or nucleotide sequences listed by their sequence
identifier numbers in Table 1, or any segment of such genes
and/or nucleotide sequences, or as contained in any one of
the clones described herein and deposited with the ATCC,
under highly stringent conditions, e.g., hybridization to
filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate
(SDS), 1 mM EDTA at 65°C, and washing in 0.1xSSC/0.1% SDS at
68°C (Ausubel F.M. et al., eds., 1989, Current Protocols in
Molecular Biology, Vol. I, Green Publishing Associates, Inc.,
and John Wiley & Sons, Inc., New York, at p. 2.10.3) and that
encodes a gene product functionally equivalent to SCR gene
product encoded completely or partly by any one of the genes
and/or sequences listed in Table 1 or any segment of such
genes and nucleotide sequences, or as contained in any one of
the clones deposited with the ATCC; (d) any gene comprising
nucleotide sequence that hybridizes to the complement of any
one of the sequences listed by their sequence identifier
numbers in Table 1, or any segment of such nucleotide
sequences, or as contained in any one of the clones described
herein and deposited with the ATCC, under less stringent
conditions, such as moderately stringent conditions, e.g.,
washing in 0.2xSSC/0.1% SDS at 42°C (Ausubel et al., 1989,
supra), and which encodes a functionally equivalent SCR gene
product; (e) any gene comprising nucleotide sequence that
hybridizes to the complement of any one of the sequences
listed by their sequence identifier numbers in Table 1 or any
segment of such nucleotide sequences, or as contained in any
one of the clones described herein and deposited with the
ATCC, under the following low stringency conditions:
pre-hybridization in hybridization solution (HS) containing 43%
formamide, 5xSSC, 1% SDS, 10% dextran sulfate, 0.1% sarkosyl,
2% block (Genius kit, Boehringer-Mannheim), followed by
hybridization overnight at 30 to 33°C using as a probe a DNA
molecule of approximately 1.6 kb of SEQ ID NO: 1 at a
concentration of 20 ng/ml, followed by washing in 2xSSC/0.1%
SDS two times for 15 minutes at room temperature and then two
times at 50°C, and which encodes a functionally equivalent
SCR gene product; and/or (f) any gene comprising nucleotide
sequence that encodes a polypeptide or protein containing the
consensus sequence for SCR (i.e., MOTIF III or VHIID) shown
in FIGS. 13B-D or a segment of such polypeptide or protein.
The partial and complete nucleotide and amino acid sequences
of SCR genes and encoded proteins and polypeptides included
in the invention are listed in Table 1 below.



Table 1. SCR ORTHOLOGS AND PARALOGS



New Name Old Name 
ARABIDOPSIS

1110 






SRPa1
SRPa2
SRPa3

SRPa4
SRPa5

SRPa6
SRPa7

SRPa8

SRPa9

SRPa10
SRPa11
SRPa12

SRPa13
SCR 



Tf4 
3935 

4818 
4871 
12398 
3635 

Tfl 

10964 

11261 
18652 
23196 

33/08 

Scr 



EST Clone 1 

Z25645/33772 

Z34599 

Z37192/1 
N96166 

F13896/7 

F13949 

R29793 

T21627 
H76979 
N96767 

T46205 (9468) 
N96653 (21711) 

T78186 
T44774 

T76483 

N37425 

W43803 

W435138 

AA042397 

T46008 

N.A. 2 



SEQ ID NOs 
Nucleotide 3 Amino Acid 



22 

20 

18 
45 
51 
55 



47 

49 
53 
57 



23 

35* 

21 

19 
46 

52 
56 

34* 

48 

50 
54 
58 

41 
2* 



RICE 

SRPo1
SRPo2

SRPo3
SRPo4

MAIZE
SRPm1

BRASSICA
SRPb1



713 
2504 



3989 
11846 

18310 

174 



D15490 

D40482 
D40607 
D40800 
D41389 

D41474 

C20324 



T18310 
H74669 



43 
44 



36 
59 

37 

42
Table 1. (Continued)

New Name
CARROT

SRPd1
SOYBEAN

SRPg1
SPRUCE

SRPp1



Old Name EST Clone 1 



SEQ ID NOs 
Nucleotide 3 Amino Acid 



N. A. 



N.A. 



N.A, 



N.A. 



N.A. 



N.A. 



60 



62 



64 



61 



63 



65 



1 Each EST clone is identified by its GenBank accession
number. Each EST clone corresponds to a deposit of a
cDNA sequence that matches a part of the nucleotide
sequence of the corresponding SCR ortholog or paralog.

2 N.A. - not applicable.

3 The partial or complete nucleotide sequences of the SCR
orthologs and paralogs listed here are shown in FIGS.
5A, 8, 9, 10, 11A-B, 12A-B, 16A-M and 17A.

Contains the complete coding sequence of Arabidopsis SCR
gene.

* Contains the complete amino acid sequence of Arabidopsis
SRPa2, SRPa8, or SCR protein.



Functional equivalents of the SCR gene product
include any plant gene product that regulates plant embryo or
root development, or, preferably, that regulates root cell
division or root tissue organization, or affects gravitropism
of plant aerial structures (e.g., stems and hypocotyls).
Functional equivalents of the SCR gene product include
naturally occurring SCR gene products, and mutant SCR gene
products, whether naturally occurring or engineered.

The invention also includes nucleic acid molecules,
preferably DNA molecules, that hybridize to, and are
therefore the complements of, the nucleotide sequences (a)
through (f), in the first paragraph of this section. Such
hybridization conditions may be highly stringent, less highly
stringent, or low stringency as described above. In
instances wherein the nucleic acid molecules are
oligonucleotides ("oligos"), highly stringent conditions may
refer, e.g., to washing in 6xSSC/0.05% sodium pyrophosphate
at 37°C (for 14-base oligos), 48°C (for 17-base oligos), 55°C
(for 20-base oligos), and 60°C (for 23-base oligos). These
nucleic acid molecules may act as SCR antisense molecules,
useful, for example, in SCR gene regulation and/or as
antisense primers in amplification reactions of SCR gene
and/or nucleic acid sequences. Further, such sequences may
be used as part of ribozyme and/or triple helix sequences,
also useful for SCR gene regulation. Still further, such
molecules may be used as components in probing methods
whereby the presence of a SCARECROW allele may be detected.

The invention also includes nucleic acid molecules,
preferably DNA molecules, which are amplified using the
polymerase chain reaction under conditions described in
Section 5.1.1., infra, and that encode a gene product
functionally equivalent to a SCR gene product encoded by any
one of the genes and sequences listed in Table 1 or as
contained in any one of the clones described herein and
deposited with the ATCC.

The invention also encompasses (a) DNA vectors that
contain any of the foregoing gene and/or coding sequences
and/or their complements (i.e., antisense or ribozyme
molecules); (b) DNA expression vectors that contain any of
the foregoing gene and/or coding sequences operatively
associated with a regulatory element that directs the
expression of the gene and/or coding sequences; and (c)
genetically engineered host cells that contain any of the
foregoing gene and/or coding sequences operatively associated
with a regulatory element that directs the expression of the
gene and/or coding sequences in the host cell. As used
herein, regulatory elements include but are not limited to
inducible and non-inducible promoters, enhancers, operators
and other elements known to those skilled in the art that
drive and regulate expression.

The invention also encompasses nucleotide sequences
that encode mutant SCR gene products, peptide fragments of
the SCR gene product, truncated SCR gene products, and SCR
fusion proteins. These include, but are not limited to,
nucleotide sequences encoding: mutant SCR gene products;
polypeptides or peptides corresponding to one or more of the
Motifs I-VI as shown in FIGS. 13A-F and FIGS. 15A-S, or the
bZIP, VHIID, or leucine heptad domains of the SCR, or
portions of these motifs and domains; and truncated SCR gene
products in which one or more of the motifs or domains is
deleted, e.g., a truncated, nonfunctional SCR lacking all or
a portion of the Motifs I-VI as shown in FIGS. 13A-F and
FIGS. 15A-S, or the bZIP, VHIID, or leucine heptad domains of
the SCR. Nucleotides encoding fusion proteins may include
but are not limited to full length SCR, truncated SCR or
peptide fragments of SCR fused to an unrelated protein or
peptide, such as, for example, an enzyme, fluorescent protein,
or luminescent protein which can be used as a marker.

In particular, the invention includes, for example,
fragments of SCR genes encoding one or more of the following
domains as shown in FIG. 5E: amino acids 1-264, 265-283,
287-316, 410-473, 436-473, and 473-653.
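The amino acid ranges listed above can be mapped onto coding-sequence coordinates by simple arithmetic, which may help when designing gene fragments. The following is a minimal sketch only: it assumes that position 1 is the first base of the start codon of an intron-free coding sequence, and the helper name is hypothetical rather than part of the disclosure.

```python
from typing import List, Tuple

# Amino acid ranges recited in the text for the domains shown in FIG. 5E.
SCR_DOMAINS: List[Tuple[int, int]] = [
    (1, 264), (265, 283), (287, 316), (410, 473), (436, 473), (473, 653),
]


def aa_range_to_cds(start_aa: int, end_aa: int) -> Tuple[int, int]:
    """Convert a 1-based inclusive amino acid range to the 1-based inclusive
    nucleotide range that encodes it.

    Assumption (not from the source): coordinates are counted from the first
    base of the start codon and the coding sequence has no introns.
    """
    if start_aa < 1 or end_aa < start_aa:
        raise ValueError("expected 1 <= start_aa <= end_aa")
    return 3 * (start_aa - 1) + 1, 3 * end_aa


if __name__ == "__main__":
    for first, last in SCR_DOMAINS:
        nt_first, nt_last = aa_range_to_cds(first, last)
        print(f"aa {first}-{last} -> nt {nt_first}-{nt_last}")
```

Coordinates in a genomic clone would differ wherever introns or untranslated leader sequence are present.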

In addition to the gene and/or coding sequences
described above, homologous SCR genes, and other genes
related by DNA sequence, may be identified and may be readily
isolated, without undue experimentation, by molecular
biological techniques well known in the art. More
specifically, such homologs include, for example, paralogs
(i.e., members of the SCR gene family occurring in the same
plant) as well as orthologs (i.e., members of the SCR gene
family which occur in a different plant species) of the
Arabidopsis SCR gene.

A specific embodiment of a SCR gene and coding
sequence of the invention is Arabidopsis SCR (FIGS. 5A and
5E). Other specific embodiments include the various SCR
genes and coding sequences listed in Table 1, supra.

Methods for isolating SCR genes and coding
sequences are described in detail in Section 5.1.1, below.

SCR genes share substantial amino acid sequence
similarities at the protein level and nucleotide sequence
similarities in their encoding genes. The term
"substantially similar" or "substantial similarity" when used
herein with respect to two amino acid sequences means that
the two sequences have at least 75% identical residues,
preferably at least 85% identical residues and most
preferably at least 95% identical residues. The same term
when used herein with respect to two nucleotide sequences
means that the two sequences have at least 70% identical
residues, preferably at least 85% identical residues and most
preferably at least 95% identical residues. Determining
whether two sequences are substantially similar may be
carried out using any methodologies known to one skilled in
the art, preferably using computer assisted analysis. For
example, the alignments shown herein were initially
accomplished by a BLAST search (NCBI using the BLAST network
server). The final alignments of SCR family members were
done manually.

Moreover, SCR genes show highly localized
expression in embryos and, particularly, roots. Such
expression patterns may be ascertained by Northern
hybridizations and in situ hybridizations using antisense
probes.
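The identity thresholds that define "substantially similar" sequences above can be applied to a pair of already aligned sequences as follows. This is a minimal sketch, not the method used to prepare the alignments described herein: it assumes that identity is counted only over alignment columns in which neither sequence has a gap (the text does not specify how gaps are treated), and the function names are hypothetical.

```python
# Thresholds (minimum, preferred, most preferred) in percent identical
# residues, taken from the definition of "substantially similar" in the text.
THRESHOLDS = {
    "protein": (75.0, 85.0, 95.0),
    "nucleotide": (70.0, 85.0, 95.0),
}


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues between two aligned sequences.

    Assumption: columns containing a gap ('-') in either sequence are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def similarity_tier(identity: float, kind: str) -> str:
    """Classify a percent identity against the thresholds for 'kind'."""
    minimum, preferred, most_preferred = THRESHOLDS[kind]
    if identity >= most_preferred:
        return "most preferred"
    if identity >= preferred:
        return "preferred"
    if identity >= minimum:
        return "substantially similar"
    return "not substantially similar"


if __name__ == "__main__":
    identity = percent_identity("VHIIDLD", "VHIIDFD")
    print(round(identity, 1), similarity_tier(identity, "protein"))
```

Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give different percentages for gapped alignments, so the convention should be stated whenever a figure is reported.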

5.1.1. ISOLATION OF SCR GENES

The following methods can be used to obtain SCR
genes and coding sequences from a wide variety of plants,
including but not limited to Arabidopsis thaliana, Zea mays,
Nicotiana tabacum, Daucus carota, Oryza, Glycine max, Lemna
gibba, and Picea abies.

Nucleotide sequences encoding an SCR gene or a
portion thereof may be obtained by PCR amplification of plant
genomic DNA or cDNA. Useful cDNA sources include "free" cDNA
preparations (i.e., the products of cDNA synthesis) and
cloned cDNA in cDNA libraries. Root cDNA preparations or
libraries are particularly preferred.

The amplification may use, as the 5'-primer (i.e.,
forward primer), a degenerate oligonucleotide that
corresponds to a segment of a known SCR amino acid sequence,
preferably from the amino-terminal region. The 3'-primer
(i.e., reverse primer) may be a degenerate oligonucleotide
that corresponds to a distal segment of the same known SCR
amino acid sequence (i.e., carboxyl to the sequence that
corresponds to the 5'-primer). For example, the amino acid
sequence of the Arabidopsis SCR protein (SEQ ID NO: 2) may be
used to design useful 5' and 3' primers. Preferably, the
primers correspond to segments in the Motif III or VHIID
domain of SCR protein (see FIGS. 13B-D and FIGS. 15K-L). The
sequence of the optimal degenerate oligonucleotide probe
corresponding to a known amino acid sequence may be
determined by standard algorithms known in the art. See, for
example, Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY, Vol. 2 (1989).

Further, for amplification from cDNA sources, the
3'-primer may be an oligonucleotide comprising a 3'
oligo(dT) sequence. The amplification may also use as
primers nucleotide sequences of SCR genes or coding sequences
(e.g., any one of the SCR sequences and EST sequences listed
in Table 1).

PCR amplification can be carried out, e.g., by use
of a Perkin-Elmer Cetus thermal cycler and Taq polymerase
(GeneAmp™). One can choose to synthesize several different
degenerate primers for use in the PCR reactions. It is also
possible to vary the stringency of hybridization conditions
used in priming the PCR reactions, to allow for greater or
lesser degrees of nucleotide sequence similarity between the
degenerate primers and the corresponding sequences in the
cDNA library. One of ordinary skill in the art will know
that the appropriate amplification conditions and parameters
depend, in part, on the length and base composition of the
primers and that such conditions may be determined using
standard formulae. Protocols for executing all PCR
procedures discussed herein are well known to those skilled
in the art, and may be found in references such as Gelfand,
1989, PCR Technology, Principles and Applications for DNA
Amplification, H.A. Erlich, ed., Stockton Press, New York;
and Current Protocols in Molecular Biology, Vol. 2, Ch. 15,
Ausubel et al., eds., 1988, New York, Wiley & Sons, Inc.

A PCR amplified sequence may be molecularly cloned
and sequenced. The amplified sequence may be utilized as a
probe to isolate genomic or cDNA clones of a SCR gene, as
described below. This, in turn, will permit the
determination of a SCR gene's complete nucleotide sequence,
including its promoter, the analysis of its expression, and
the production of its encoded protein, as described infra.

In a preferred embodiment, PCR amplification of SCR
gene and/or coding sequences can be carried out according to
the following procedure:

Forward:

Name:          SCR5AII (23-mer, 2 inosines, 64-mix)
A.A. code:     HFTANQAI
DNA Sequence:  5' CAT/C TTT/C ACI GCI AAT/C CAA/G GCN AT 3'

Name:          SCR5B (29-mer, 1 inosine, 144-mix)
A.A. code:     VHIID(L/F)D
DNA Sequence:  5' ACGTCTCGA GTI CAT/C ATA/C/T ATA/C/T GAT/C
               TTN GA 3'

Name:          1F
A.A. code:     LQCAEAV
DNA Sequence:  (T/C)TI CA(A/G) TG(T/C) GCI GA(A/G) GCN GT

Reverse:

Name:          SCR3AII (23-mer, 2 inosines, 128-mix)
A.A. code:     PGGPP(H/N/K)(V/L/F)R
DNA Sequence:  5' CG/T CCA/C GTG/T TGG IGG ICC NCC NGG 3'

Name:          1R
A.A. code:     AFQVFNGI
DNA Sequence:  AT ICC (A/G)TT (A/G)AA IAC (C/T)TG (A/G)AA NGC

Name:          4R
A.A. code:     QWPGLFHI
DNA Sequence:  AT (A/G)TG (A/G)AA IA(A/G) NCC IGG CCA (C/T)TG

I = inosine
N = A/C/G/T

Useful primer combinations include the following:
SCR5AII+SCR3AII; SCR5B+SCR3AII; 1F+1R; and 1F+4R.
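The "mix" figure quoted for each degenerate primer is the number of distinct sequences in the oligonucleotide pool, i.e., the product of the number of alternatives at each position. The sketch below recomputes those figures as a consistency check. The IUPAC spellings of the primers are a transcription made for this example (e.g., T/C written as Y, A/C/T as H) and are not taken from the source; inosine (I) is counted as a single species.

```python
from math import prod

# Number of bases represented by each symbol. 'I' (inosine) is one species.
DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1, "I": 1,
    "R": 2, "Y": 2, "K": 2, "M": 2, "S": 2, "W": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

# Assumed IUPAC transcriptions of three primers listed in the text.
PRIMERS = {
    "SCR5AII": "CAY TTY ACI GCI AAY CAR GCN AT",
    "SCR5B": "ACGTCTCGA GTI CAY ATH ATH GAY TTN GA",
    "SCR3AII": "CK CCM GTK TGG IGG ICC NCC NGG",
}


def primer_stats(iupac: str) -> dict:
    """Return length, inosine count and pool size ('mix') of a primer."""
    seq = iupac.replace(" ", "").upper()
    return {
        "length": len(seq),
        "inosines": seq.count("I"),
        "mix": prod(DEGENERACY[base] for base in seq),
    }


if __name__ == "__main__":
    for name, sequence in PRIMERS.items():
        print(name, primer_stats(sequence))
```

Replacing a fourfold-degenerate position with inosine, as several of the primers above do, keeps the pool size down at the cost of weaker base pairing at that position.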

PCR:

Reaction mixture (volume 50 µl):

-5 µl 10X amplification buffer containing Mg (Boehringer-
Mannheim)
-1 µl 10 mM dNTP's
-1 µl forward primer (stock concentration: 80 pmol/µl)
-1 µl reverse primer (80 pmol/µl)
-DNA (100-300 ng).

Begin reaction with "hot start" in which the enzyme is added
to the mix only after a brief denaturation at a high
temperature (80°C).
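The final concentration of each component in the 50 µl reaction follows from the dilution relation C_final = C_stock x V_added / V_total. The sketch below works this out for the volumes listed above; it is illustrative only, and the helper name is hypothetical. Note that 1 pmol/µl equals 1 µM, and that the text does not state whether "10 mM dNTP's" refers to each nucleotide or to the total.

```python
def final_concentration(stock: float, volume_ul: float, total_ul: float = 50.0) -> float:
    """Dilution: concentration after adding volume_ul of stock to a reaction
    whose final volume is total_ul. The result has the same unit as 'stock'."""
    if total_ul <= 0 or volume_ul < 0:
        raise ValueError("volumes must be positive")
    return stock * volume_ul / total_ul


if __name__ == "__main__":
    # Primer: 1 µl of an 80 pmol/µl stock (80 µM) in 50 µl.
    print("primer, µM:", final_concentration(80.0, 1.0))
    # dNTPs: 1 µl of a 10 mM stock in 50 µl.
    print("dNTP, mM:", final_concentration(10.0, 1.0))
    # Buffer: 5 µl of 10X stock in 50 µl.
    print("buffer, X:", final_concentration(10.0, 5.0))
```

The primer concentration that results (1.6 µM) is higher than is typical for non-degenerate primers, which is consistent with each individual sequence in a 64- to 144-fold degenerate pool being present at only a fraction of the total.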

Cycles:

94°C  30 sec - brief denaturation (to prevent non-specific
               priming)
80°C  5 min  - apply the enzyme to the tubes (30 tubes/round
               at maximum)
94°C  5 min  - thorough denaturation

2 times:   94°C  1 min
           64°C  5 min
           72°C  2 min

2 times:   94°C  1 min
           62°C  5 min
           72°C  2 min

2 times:   94°C  1 min
           60°C  5 min
           72°C  2 min

(reduce the annealing temperature 2°C in every second round)
until 44°C is reached; after that:

40 times:  94°C  20 sec
           48°C  1 min
           72°C  2 min

finally, let cool down to 15°C.
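The stepwise lowering of the annealing temperature described above (a "touchdown" style program) can be written out explicitly as a list of stages. This is a sketch under stated assumptions: the 44°C stage is taken to be run twice like the stages before it, which the text leaves ambiguous, and the function name and data layout are hypothetical. The initial hot-start steps and the final cooling to 15°C are omitted.

```python
from typing import List, Tuple

Step = Tuple[int, int]            # (temperature in °C, duration in seconds)
Stage = Tuple[int, List[Step]]    # (number of repeats, steps per cycle)


def touchdown_program(
    start_anneal: int = 64,
    stop_anneal: int = 44,
    decrement: int = 2,
    rounds_per_stage: int = 2,
    final_cycles: int = 40,
) -> List[Stage]:
    """Build the cycling stages: denature 94°C 1 min, anneal 5 min, extend
    72°C 2 min, lowering the annealing temperature by 'decrement' after every
    'rounds_per_stage' rounds, then 40 cycles annealing at 48°C."""
    program: List[Stage] = []
    anneal = start_anneal
    while anneal >= stop_anneal:
        program.append((rounds_per_stage, [(94, 60), (anneal, 300), (72, 120)]))
        anneal -= decrement
    # Final amplification stage as given in the text.
    program.append((final_cycles, [(94, 20), (48, 60), (72, 120)]))
    return program


if __name__ == "__main__":
    for repeats, steps in touchdown_program():
        print(f"{repeats} x", ", ".join(f"{t}°C {s}s" for t, s in steps))
```

Under these assumptions the program comprises 11 touchdown stages of 2 rounds each followed by 40 cycles, for 62 cycles in total.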



A SCR gene coding sequence may also be isolated by
screening a plant genomic or cDNA library using a SCR
nucleotide sequence (e.g., the sequence of any of the SCR
genes and sequences and EST clone sequences listed in Table
1) as a hybridization probe. For example, the whole or a
segment of the Arabidopsis SCR nucleotide sequence (FIG. 5A)
may be used. Alternatively, a SCR gene may be isolated from
such libraries using as probe a degenerate oligonucleotide
that corresponds to a segment of a SCR amino acid sequence.
For example, a degenerate oligonucleotide probe corresponding
to a segment of the Arabidopsis SCR amino acid sequence (FIG.
5E) may be used.

In preparation of cDNA libraries, total RNA is
isolated from plant tissues, preferably roots. Poly(A)+ RNA
is isolated from the total RNA, and cDNA prepared from the
poly(A)+ RNA, all using standard procedures. See, for
example, Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2d ed., Vol. 2 (1989). The cDNAs may be synthesized
with a restriction enzyme site at their 3'-ends by using an
appropriate primer and further have linkers or adaptors
attached at their 5'-ends to facilitate the insertion of the
cDNAs into suitable cDNA cloning vectors. Alternatively,
adaptors or linkers may be attached to the cDNAs after the
completion of cDNA synthesis.
In preparation of genomic libraries, plant DNA is
isolated and fragments are generated, some of which will
encode parts or the whole of the SCR protein. The DNA may be
cleaved at specific sites using various restriction enzymes.
Alternatively, one may use DNase in the presence of manganese
to fragment the DNA, or the DNA can be physically sheared, as
for example, by sonication. The DNA fragments can then be
separated according to size by standard techniques, including
but not limited to, agarose and polyacrylamide gel
electrophoresis, column chromatography and sucrose gradient
centrifugation.

The genomic DNA or cDNA fragments can be inserted
into suitable vectors, including but not limited to,
plasmids, cosmids, bacteriophages lambda or T4, and yeast
artificial chromosome (YAC) [See, for example, Sambrook et
al., Molecular Cloning: A Laboratory Manual, 2d ed., Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, New York
(1989); Glover, D.M. (ed.), DNA Cloning: A Practical Approach,
IRL Press, Ltd., Oxford, U.K., Vols. I and II (1985)].

The SCR nucleotide probe, DNA or RNA, should be at
least 17 nucleotides, preferably at least 26 nucleotides, and
most preferably at least 50 nucleotides in length. The
nucleotide probe is hybridized under moderate stringency
conditions and washed under moderate, preferably high
stringency conditions. Clones in libraries with insert DNA
having substantial homology to the SCR probe will hybridize
to the probe. Hybridization of the nucleotide probe to
genomic or cDNA libraries is carried out using methods known
in the art. One of ordinary skill in the art will know that
the appropriate hybridization and wash conditions depend on
the length and base composition of the probe and that such
conditions may be determined using standard formulae. See,
for example, Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY, Vol. 2, (1989) pp 11.45-11.57 and 15.55-
15.57.
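As an illustration of how hybridization conditions depend on probe length and base composition, the sketch below implements two widely used approximations of melting temperature: the "2 + 4" rule for short oligonucleotides and a salt- and formamide-adjusted formula for longer probes. These are offered as commonly cited estimates and are not asserted to be the specific formulae on the pages cited above; the coefficients vary slightly between references, and the function names are hypothetical.

```python
import math


def wallace_tm(oligo: str) -> float:
    """Approximate Tm (°C) of a short oligonucleotide (roughly 14-20 bases):
    2°C per A or T plus 4°C per G or C."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def long_probe_tm(
    length: int,
    gc_percent: float,
    na_molar: float = 1.0,
    formamide_percent: float = 0.0,
) -> float:
    """Approximate Tm (°C) of a long DNA:DNA hybrid.

    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%G+C) - 0.63*(%formamide) - 600/length
    """
    if length <= 0 or na_molar <= 0:
        raise ValueError("length and na_molar must be positive")
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.63 * formamide_percent
        - 600.0 / length
    )


if __name__ == "__main__":
    print(wallace_tm("ACGTACGTACGTACGTA"))       # 17-base oligo
    print(long_probe_tm(1600, 45.0, 0.75, 43.0))  # illustrative values only
```

Hybridization and washing are usually carried out some degrees below the estimated Tm; how far below sets the stringency, which is why shorter or more A/T-rich probes call for lower temperatures.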

The identity of a cloned or amplified SCR gene
sequence can be verified by comparing the amino acid
sequences of its three open reading frames with the amino
acid sequence of a SCR gene product (e.g., Arabidopsis SCR
protein [SEQ ID NO: 2]). A SCR gene or coding sequence
encodes a protein or polypeptide whose amino acid sequence is
substantially similar to that of a SCR protein or polypeptide
(e.g., the amino acid sequence of any one of the SCR proteins
and/or polypeptides shown in FIG. 5A, FIG. 5E, FIG. 8, FIG. 9,
FIGS. 11A-B, FIGS. 15A-S, FIG. 17B and FIG. 18). The
identity of the cloned or amplified SCR gene sequence may be
further verified by examining its expression pattern, which
should show highly localized expression in the embryo and/or
root of the plant from which the SCR gene sequence was
isolated.
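The comparison of the three reading frames described above begins with conceptual translation of the cloned or amplified sequence. The following is a minimal sketch using the standard genetic code; it translates the three forward frames only (the complementary strand would be handled the same way after reverse complementation), and the function name and the short test sequence are illustrative assumptions.

```python
from itertools import product
from typing import List

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def three_frame_translation(dna: str) -> List[str]:
    """Translate the three forward reading frames of a DNA (or RNA) sequence.

    Stop codons are written as '*'; codons containing ambiguous bases are
    written as 'X'. Trailing bases that do not fill a codon are ignored.
    """
    seq = dna.upper().replace("U", "T")
    frames = []
    for offset in range(3):
        codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
        frames.append("".join(CODON_TABLE.get(codon, "X") for codon in codons))
    return frames


if __name__ == "__main__":
    # Illustrative sequence: one leading base followed by codons for V-H-I-I-D.
    frames = three_frame_translation("AGTTCATATTATTGAT")
    for number, protein in enumerate(frames, start=1):
        marker = "  <- contains VHIID" if "VHIID" in protein else ""
        print(f"frame {number}: {protein}{marker}")
```

Each translated frame can then be aligned against a known SCR amino acid sequence; the frame that encodes a SCR protein is expected to show the conserved motifs, such as VHIID, without premature stop codons.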

Comparison of the amino acid sequences encoded by a
cloned or amplified sequence may reveal that it does not
contain the entire SCR gene or its promoter. In such a case
the cloned or amplified SCR gene sequence may be used as a
probe to screen a genomic library for clones having inserts
that overlap the cloned or amplified SCR gene sequence. A
complete SCR gene and its promoter may be reconstructed by
splicing the overlapping SCR gene sequences.

5.1.2. EXPRESSION OF SCR GENE PRODUCTS 

SCR proteins, polypeptides and peptide fragments,
mutated, truncated or deleted forms of SCR and/or SCR fusion
proteins can be prepared for a variety of uses, including but
not limited to the generation of antibodies, as reagents in
assays, the identification of other cellular gene products
involved in regulation of root development, etc.

SCR translational products include, but are not
limited to those proteins and polypeptides encoded by the SCR
gene sequences described in Section 5.1, above. The
invention encompasses proteins that are functionally
equivalent to the SCR gene products described in Section 5.1.
Such a SCR gene product may contain one or more deletions,
additions or substitutions of SCR amino acid residues within
the amino acid sequence encoded by any one of the SCR gene
sequences described, above, in Section 5.1, but which result
in a silent change, thus producing a functionally equivalent
SCR gene product. Amino acid substitutions may be made on
the basis of similarity in polarity, charge, solubility,
hydrophobicity, hydrophilicity, and/or the amphipathic nature
of the residues involved.

For example, nonpolar (hydrophobic) amino acids
include alanine, leucine, isoleucine, valine, proline,
phenylalanine, tryptophan, and methionine; polar neutral
amino acids include glycine, serine, threonine, cysteine,
tyrosine, asparagine, and glutamine; positively charged
(basic) amino acids include arginine, lysine, and histidine;
and negatively charged (acidic) amino acids include aspartic
acid and glutamic acid. "Functionally equivalent", as
utilized herein, refers to a protein capable of exhibiting a
substantially similar in vivo activity as the endogenous SCR
gene products encoded by the SCR gene sequences described in
Section 5.1, above. Alternatively, "functionally equivalent"
may refer to peptides capable of regulating gene expression
in a manner substantially similar to the way in which the
corresponding portion of the endogenous SCR gene product
would.
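The four residue classes listed above can be used to flag whether a given substitution stays within a class. The sketch below is illustrative only: treating a same-class substitution as "conservative" is a simplifying assumption made for this example, the function names are hypothetical, and a same-class substitution does not by itself establish that a gene product is functionally equivalent.

```python
from typing import Optional

# One-letter codes grouped by the classes named in the text.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),
    "polar neutral": set("GSTCYNQ"),
    "basic": set("RKH"),
    "acidic": set("DE"),
}


def residue_class(residue: str) -> Optional[str]:
    """Return the class name of a one-letter amino acid code, or None."""
    code = residue.upper()
    for name, members in RESIDUE_CLASSES.items():
        if code in members:
            return name
    return None


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues are recognized and fall in the same class."""
    first = residue_class(original)
    return first is not None and first == residue_class(replacement)


if __name__ == "__main__":
    # Leucine/phenylalanine, the alternative seen in the VHIID(L/F)D motif.
    print(is_conservative("L", "F"))
    # Aspartic acid to lysine reverses the charge.
    print(is_conservative("D", "K"))
```

Such a check can serve as a first filter when choosing candidate substitutions, with activity of the resulting protein then confirmed experimentally.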

The invention also encompasses mutant SCR proteins
and polypeptides that are not functionally equivalent to
the gene products described in Section 5.1. Such a mutant
SCR protein or polypeptide may contain one or more deletions,
additions or substitutions of SCR amino acid residues within
the amino acid sequence encoded by any one of the SCR gene
sequences described above in Section 5.1., and which result
in loss of one or more functions of the SCR protein (e.g.,
recognition of a specific nucleic acid sequence, binding of a
transcription factor, etc.), thus producing a SCR gene
product not functionally equivalent to the wild-type SCR
protein.

While random mutations can be made to SCR DNA
(using random mutagenesis techniques well known to those
skilled in the art) and the resulting mutant SCRs tested for
activity, site-directed mutations of the SCR gene and/or
coding sequence can be engineered (using site-directed
mutagenesis techniques well known to those skilled in the
art) to generate mutant SCRs with increased function (e.g.,
resulting in improved root formation), or decreased function
(e.g., resulting in suboptimal root function). In
particular, mutated SCR proteins in which any of the domains
shown in FIGS. 13A-F are deleted or mutated are within the
scope of the invention. Additionally, peptides corresponding
to one or more domains of the SCR (e.g., shown in FIGS. 13A-
F), truncated or deleted SCRs, as well as fusion proteins in
which the full length SCR, a SCR polypeptide or peptide is
fused to an unrelated protein are also within the scope of
the invention and can be designed on the basis of the SCR
nucleotide and SCR amino acid sequences disclosed in Section
5.1, above.

While the SCR polypeptides and peptides can be
chemically synthesized (e.g., see Creighton, 1983, Proteins:
Structures and Molecular Principles, W.H. Freeman & Co.,
N.Y.), large polypeptides derived from SCR and the full length
SCR may advantageously be produced by recombinant DNA
technology using techniques well known to those skilled in
the art for expressing nucleic acid sequences.

Methods which are well known to those skilled in
the art can be used to construct expression vectors
containing SCR protein coding sequences and appropriate
transcriptional/translational control signals. These methods
include, for example, in vitro recombinant DNA techniques,
synthetic techniques and in vivo recombination/genetic
recombination. See, for example, the techniques described in
Sambrook et al., 1989, supra, and Ausubel et al., 1989,
supra. Alternatively, RNA capable of encoding SCR protein
sequences may be chemically synthesized using, for example,
synthesizers. See, for example, the techniques described in
"Oligonucleotide Synthesis", 1984, Gait, M.J. ed., IRL Press,
Oxford.

A variety of host-expression vector systems may be
utilized to express the SCR gene products of the invention.
Such host-expression systems represent vehicles by which the
SCR gene products of interest may be produced and
subsequently recovered and/or purified from the culture or
plant (using purification methods well known to those skilled
in the art), but also represent cells which may, when
transformed or transfected with the appropriate nucleotide
coding sequences, exhibit the SCR protein of the invention in
situ. These include but are not limited to microorganisms
such as bacteria (e.g., E. coli, B. subtilis) transformed
with recombinant bacteriophage DNA, plasmid DNA or cosmid DNA
expression vectors containing SCR protein coding sequences;
yeast (e.g., Saccharomyces, Pichia) transformed with
recombinant yeast expression vectors containing the SCR
protein coding sequences; insect cell systems infected with
recombinant virus expression vectors (e.g., baculovirus)
containing the SCR protein coding sequences; plant cell
systems infected with recombinant virus expression vectors
(e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus,
TMV) or transformed with recombinant plasmid expression
vectors (e.g., Ti plasmid) containing SCR protein coding
sequences; or mammalian cell systems (e.g., COS, CHO, BHK,
293, 3T3) harboring recombinant expression constructs
containing promoters derived from the genome of mammalian
cells (e.g., metallothionein promoter) or from mammalian
viruses (e.g., the adenovirus late promoter; the vaccinia
virus 7.5K promoter; the cytomegalovirus promoter/enhancer;
etc.).

In bacterial systems, a number of expression
vectors may be advantageously selected depending upon the use
intended for the SCR protein being expressed. For example,
when a large quantity of such a protein is to be produced,
for the generation of antibodies or to screen peptide
libraries, for example, vectors which direct the expression
of high levels of fusion protein products that are readily
purified may be desirable. Such vectors include, but are not
limited to, the E. coli expression vector pUR278 (Ruther et
al., 1983, EMBO J. 2:1791), in which the SCR coding sequence
may be ligated individually into the vector in frame with the
lacZ coding region so that a fusion protein is produced; pIN
vectors (Inouye & Inouye, 1985, Nucleic Acids Res.
13:3101-3109; Van Heeke & Schuster, 1989, J. Biol. Chem.
264:5503-5509); and the like. pGEX vectors may also be used to
express foreign polypeptides as fusion proteins with
glutathione S-transferase (GST). In general, such fusion
proteins are soluble and can easily be purified from lysed
cells by adsorption to glutathione-agarose beads followed by
elution in the presence of free glutathione. The pGEX vectors
are designed to include thrombin or factor Xa protease cleavage
sites so that the cloned target gene protein can be released
from the GST moiety.
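
As a rough illustration of the release step, the sketch below scans a
candidate target protein for internal copies of the protease
recognition motifs, since an internal copy would also be cut when the
target is released from GST. The motif strings (LVPRGS for thrombin,
IEGR for factor Xa) are commonly cited recognition sequences and are
assumptions here, as are the function name and the example peptide;
none of them is taken from this disclosure.

```python
import re

# Assumed recognition motifs (one-letter amino acid code); not from the source text.
CLEAVAGE_MOTIFS = {"thrombin": "LVPRGS", "factor Xa": "IEGR"}


def internal_cleavage_sites(protein: str) -> dict:
    """Return, per protease, the 0-based start positions of every motif copy
    found inside the target protein sequence."""
    protein = protein.upper()
    return {
        name: [m.start() for m in re.finditer(f"(?={motif})", protein)]
        for name, motif in CLEAVAGE_MOTIFS.items()
    }


if __name__ == "__main__":
    # Hypothetical peptide used only to exercise the function.
    print(internal_cleavage_sites("MKLVPRGSAAIEGRK"))
```

An empty list for the protease of choice suggests, under these
assumptions, that the released target would remain intact.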

In one such embodiment of a bacterial system, full
length cDNA sequences are appended with in-frame BamHI sites
at the amino terminus and EcoRI sites at the carboxyl
terminus using standard PCR methodologies (Innis et al.,
1990, supra) and ligated into the pGEX-2TK vector (Pharmacia,
Uppsala, Sweden). The resulting cDNA construct contains a
kinase recognition site at the amino terminus for radioactive
labelling and glutathione S-transferase sequences at the
carboxyl terminus for affinity purification (Nilsson et al.,
1985, EMBO J. 4:1075; Zabeau and Stanley, 1982, EMBO J.
1:1217).
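
The primer design implied by this embodiment can be sketched as
follows. This is a minimal illustration, not the procedure actually
used: the 5' clamp, the annealing length, the function names and the
example coding sequence are all hypothetical, and the sketch assumes
that the vector reading frame runs codon by codon through the
GGATCC site so that a coding sequence placed directly after it stays
in frame.

```python
BAMHI = "GGATCC"  # BamHI recognition sequence
ECORI = "GAATTC"  # EcoRI recognition sequence
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(cds: str, anneal: int = 18, clamp: str = "CGCGCG"):
    """Return (forward, reverse) primers that add a BamHI site 5' of the
    coding sequence and an EcoRI site 3' of it.

    `clamp` is an arbitrary (assumed) 5' extension that gives the
    restriction enzymes some flanking sequence to bind.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if BAMHI in cds or ECORI in cds:
        raise ValueError("coding sequence contains an internal BamHI or EcoRI site")
    forward = clamp + BAMHI + cds[:anneal]
    reverse = clamp + ECORI + reverse_complement(cds[-anneal:])
    return forward, reverse


if __name__ == "__main__":
    # Hypothetical 27-nt open reading frame, for illustration only.
    fwd, rev = design_primers("ATGGCGTCGTTGCTGAGCAAGCTTTGA", anneal=9)
    print(fwd)
    print(rev)
```

Rejecting coding sequences that carry an internal BamHI or EcoRI site
is a design choice of this sketch; in practice a different enzyme pair
would be chosen for such inserts.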



The recombinant constructs of the present invention
may include a selectable marker for propagation of the
construct. For example, a construct to be propagated in
bacteria preferably contains an antibiotic resistance gene
such as one that confers resistance to kanamycin,
tetracycline, streptomycin, or chloramphenicol. Suitable
vectors for propagating the construct include plasmids,
cosmids, bacteriophages or viruses, to name but a few.

In addition, the recombinant constructs may include
plant-expressible, selectable, or screenable marker genes for
isolating, identifying or tracking plant cells transformed by
these constructs. Selectable markers include, but are not
limited to, genes that confer antibiotic resistance (e.g.,
resistance to kanamycin or hygromycin) or herbicide
resistance (e.g., resistance to sulfonylurea,
phosphinothricin, or glyphosate). Screenable markers
include, but are not limited to, genes encoding
β-glucuronidase (Jefferson, 1987, Plant Mol. Biol. Rep.
5:387-405), luciferase (Ow et al., 1986, Science 234:856-859),
and the B protein that regulates anthocyanin pigment
production (Goff et al., 1990, EMBO J 9:2517-2522).

In embodiments of the present invention which
utilize the Agrobacterium tumefaciens system for transforming
plants (see infra), the recombinant constructs may
additionally comprise at least the right T-DNA border
sequences flanking the DNA sequences to be transformed into
the plant cell. Alternatively, the recombinant constructs
may comprise the right and left T-DNA border sequences
flanking the DNA sequence. The proper design and
construction of such T-DNA based transformation vectors are
well known to those skilled in the art.

5.1.3. ANTIBODIES TO SCR PROTEINS AND POLYPEPTIDES

Antibodies that specifically recognize one or more
epitopes of SCR, or epitopes of conserved variants of SCR, or
peptide fragments of the SCR are also encompassed by the
invention. Such antibodies include but are not limited to
polyclonal antibodies, monoclonal antibodies (mAbs),
humanized or chimeric antibodies, single chain antibodies,
Fab fragments, F(ab')2 fragments, fragments produced by a Fab
expression library, anti-idiotypic (anti-Id) antibodies, and
epitope-binding fragments of any of the above.

For the production of antibodies, various host
animals may be immunized by injection with the SCR protein,
an SCR peptide (e.g., one corresponding to a functional
domain of the protein), a truncated SCR polypeptide (SCR in
which one or more domains has been deleted), functional
equivalents of the SCR protein, or mutants of the SCR
protein. Such SCR proteins, polypeptides, peptides or fusion
proteins can be prepared and obtained as described in Section
5.1.2. supra. Host animals may include but are not limited
to rabbits, mice, and rats, to name but a few. Various
adjuvants may be used to increase the immunological response,
depending on the host species, including but not limited to
Freund's (complete and incomplete), mineral gels such as
aluminum hydroxide, surface active substances such as
lysolecithin, pluronic polyols, polyanions, peptides, oil
emulsions, keyhole limpet hemocyanin, dinitrophenol, and
potentially useful human adjuvants such as BCG (bacille
Calmette-Guerin) and Corynebacterium parvum. Polyclonal
antibodies are heterogeneous populations of antibody
molecules derived from the sera of the immunized animals.

Monoclonal antibodies, which are homogeneous
populations of antibodies to a particular antigen, may be
obtained by any technique which provides for the production
of antibody molecules by continuous cell lines in culture.
These include, but are not limited to, the hybridoma
technique of Kohler and Milstein (1975, Nature 256:495-497;
and U.S. Patent No. 4,376,110), the human B-cell hybridoma
technique (Kosbor et al., 1983, Immunology Today 4:72; Cole
et al., 1983, Proc. Natl. Acad. Sci. USA 80:2026-2030), and
the EBV-hybridoma technique (Cole et al., 1985, Monoclonal
Antibodies And Cancer Therapy, Alan R. Liss, Inc., pp.
77-96). Such antibodies may be of any immunoglobulin class
including IgG, IgM, IgE, IgA, IgD and any subclass thereof.
The hybridoma producing the mAb of this invention may be
cultivated in vitro or in vivo. Production of high titers of
mAbs in vivo makes this the presently preferred method of
production.

In addition, techniques developed for the
production of "chimeric antibodies" (Morrison et al., 1984,
Proc. Natl. Acad. Sci., 81:6851-6855; Neuberger et al., 1984,
Nature, 312:604-608; Takeda et al., 1985, Nature,
314:452-454) by splicing the genes from a mouse antibody
molecule of appropriate antigen specificity together with
genes from a human antibody molecule of appropriate
biological activity can be used. A chimeric antibody is a
molecule in which different portions are derived from
different animal species, such as those having a variable
region derived from a murine mAb and a human immunoglobulin
constant region.

In addition, techniques have been developed for the
production of humanized antibodies. (See, e.g., Queen, U.S.
Patent No. 5,585,089.) An immunoglobulin light or heavy
chain variable region consists of a "framework" region
interrupted by three hypervariable regions, referred to as
complementarity determining regions (CDRs). The extent of
the framework region and CDRs has been precisely defined
(see "Sequences of Proteins of Immunological Interest",
Kabat, E. et al., U.S. Department of Health and Human
Services (1983)). Briefly, humanized antibodies are antibody
molecules from non-human species having one or more CDRs from
the non-human species and a framework region from a human
immunoglobulin molecule.

Alternatively, techniques described for the
production of single chain antibodies (U.S. Patent 4,946,778;
Bird, 1988, Science 242:423-426; Huston et al., 1988, Proc.
Natl. Acad. Sci. USA 85:5879-5883; and Ward et al., 1989,
Nature 334:544-546) can be adapted to produce single chain
antibodies against SCR proteins or polypeptides. Single
chain antibodies are formed by linking the heavy and light
chain fragments of the Fv region via an amino acid bridge,
resulting in a single chain polypeptide.

Antibody fragments which recognize specific
epitopes may be generated by known techniques. For example,
such fragments include but are not limited to: the F(ab')2
fragments which can be produced by pepsin digestion of the
antibody molecule and the Fab fragments which can be
generated by reducing the disulfide bridges of the F(ab')2
fragments. Alternatively, Fab expression libraries may be
constructed (Huse et al., 1989, Science, 246:1275-1281) to
allow rapid and easy identification of monoclonal Fab
fragments with the desired specificity.

Antibodies to a SCR protein and/or polypeptide can,
in turn, be utilized to generate anti-idiotype antibodies
that "mimic" SCR, using techniques well known to those
skilled in the art. (See, e.g., Greenspan & Bona, 1993,
FASEB J 7(5):437-444; and Nisonoff, 1991, J. Immunol.
147(8):2429-2438).



5.1.4. SCR GENE OR GENE PRODUCTS AS
MARKERS FOR QUALITATIVE TRAIT LOCI

Any of the nucleotide sequences (including EST
clone sequences) described in Sections 5.1 and 5.1.1. and/or
listed in Table 1, and/or polypeptides and proteins described
in Section 5.1.2. and/or listed in Table 1, can be used as
markers for qualitative trait loci in breeding programs for
crop plants. To this end, the nucleic acid molecules, including
but not limited to full length SCR coding sequences, and/or
partial sequences (ESTs), can be used in hybridization and/or
DNA amplification assays to identify the endogenous SCR
genes, scr mutant alleles and/or SCR expression products in
cultivars as compared to wild-type plants. They can also be
used as markers for linkage analysis of qualitative trait
loci. It is also possible that the SCR gene may encode a
product responsible for a qualitative trait that is desirable
in a crop breeding program. Alternatively, the SCR protein,
peptides and/or antibodies can be used as reagents in
immunoassays to detect expression of the SCR gene in
cultivars and wild-type plants.



5.2. SCR PROMOTERS

According to the present invention, SCR promoters
and functional portions thereof described herein refer to
regions of the SCR gene which are capable of promoting
tissue-specific expression in embryos and/or roots of an
operably linked coding sequence in plants. The SCR promoter
described herein refers to the regulatory elements of SCR
genes, i.e., regulatory regions of genes which are capable of
selectively hybridizing to the nucleic acids described in
Section 5.1, or regulatory sequences contained, for example,
in the region between the translational start site of the
Arabidopsis SCR gene and the HindIII site approximately 2.5
kb upstream of the site in plasmid pLIGl-3/SAC+Mob21SAC (see
FIGS. 5A and 14) in hybridization assays, or which are
homologous by sequence analysis (containing a span of 10 or
more nucleotides in which at least 50 percent of the
nucleotides are identical to the sequences presented herein).
Homologous nucleotide sequences refer to nucleotide sequences
including, but not limited to, SCR promoters in diverse plant
species (e.g., promoters of orthologs of Arabidopsis SCR) as
well as genetically engineered derivatives of the promoters
described herein.
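
The sequence-analysis criterion above (a span of 10 or more
nucleotides with at least 50 percent identity) can be expressed as a
short check. The sketch below is one possible reading, under stated
assumptions: it compares the two sequences without gaps at every
relative offset and tests windows of exactly the minimum span length,
which is a simplification of "10 or more". The function and parameter
names are hypothetical.

```python
def has_homologous_span(query: str, reference: str,
                        min_span: int = 10, min_identity: float = 0.5) -> bool:
    """True if some ungapped alignment of `query` against `reference`
    contains a window of `min_span` nucleotides in which at least
    `min_identity` of the aligned positions are identical."""
    q, r = query.upper(), reference.upper()
    needed = min_identity * min_span
    # offset = (index in reference) - (index in query) for aligned positions
    for offset in range(min_span - len(q), len(r) - min_span + 1):
        q_start = max(0, -offset)
        r_start = max(0, offset)
        overlap = min(len(q) - q_start, len(r) - r_start)
        if overlap < min_span:
            continue
        matches = [q[q_start + i] == r[r_start + i] for i in range(overlap)]
        for start in range(overlap - min_span + 1):
            if sum(matches[start:start + min_span]) >= needed:
                return True
    return False


if __name__ == "__main__":
    print(has_homologous_span("AAAAACCCCC", "AAAAAGGGGG"))  # 5 of 10 positions match
```

A gapped alignment tool would be the more usual choice for real
promoter comparisons; the ungapped scan is used here only because it
keeps the criterion itself visible.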

Methods which could be used for the synthesis,
isolation, molecular cloning, characterization and
manipulation of SCR promoter sequences are well known to
those skilled in the art. See, e.g., the techniques
described in Sambrook et al., Molecular Cloning: A
Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory,
Cold Spring Harbor, New York (1989).

According to the present invention, SCR promoter
sequences or portions thereof described herein may be
obtained from appropriate plant or mammalian sources from
cell lines or recombinant DNA constructs containing SCR
promoter sequences, and/or by chemical synthetic methods.
SCR promoter sequences can be obtained from genomic clones
containing sequences 5' upstream of SCR coding sequences.
Such 5' upstream clones may be obtained by screening genomic
libraries using SCR protein coding sequences, particularly
those encoding SCR N-terminal sequences, from SCR gene clones
obtained as described in Sections 5.1. and 5.2. Standard
methods that may be used in such screening include, for
example, the method set forth in Benton & Davis, 1977, Science
196:180 for bacteriophage libraries; and Grunstein & Hogness,
1975, Proc. Nat. Acad. Sci. U.S.A. 72:3961-3965 for plasmid
libraries.

The full extent and location of SCR promoters
within such 5' upstream clones may be determined by the
functional assay described below. In the event a 5' upstream
clone does not contain the entire SCR promoter as determined
by the functional assay, the insert DNA of the clone may be
used to isolate genomic clones containing sequences further
5' upstream of the SCR coding sequences. Such further
upstream sequences can be spliced on to existing 5' upstream
sequences and the reconstructed 5' upstream region tested for
functionality as a SCR promoter (i.e., promoting
tissue-specific expression in embryos and/or roots of an
operably linked gene in plants). This process may be repeated
until the complete SCR promoter is obtained.

The location of the SCR promoter within genomic
sequences 5' upstream of the SCR gene isolated as described
above may be determined using any method known in the art.
For example, the 3'-end of the promoter may be identified by
locating the transcription initiation site, which may be
determined by methods such as RNase protection (e.g., Liang
et al., 1989, J. Biol. Chem. 264:14486-14498), primer
extension (e.g., Weissenborn & Larson, 1992, J. Biol. Chem.
267:6122-6131), and/or reverse transcriptase/PCR. The
location of the 3'-end of the promoter may be confirmed by
sequencing and computer analysis, examining for the canonical
AGGA or TATA boxes of promoters that are typically 50-60 base
pairs (bp) and 25-35 bp, respectively, 5' upstream of the
transcription initiation site. The 5'-end of the promoter may
be defined by deleting sequences from the 5'-end of the
promoter-containing fragment, constructing a transcriptional
or translational fusion of the resected fragment and a
reporter gene, and examining the expression characteristics
of the chimeric gene in transgenic plants. Reporter genes
that may be used to such ends include, but are not limited
to, GUS, CAT, luciferase, β-galactosidase and the C1 and R
genes controlling anthocyanin production.
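
The computer analysis mentioned above can be sketched as a simple
motif scan. The example assumes that the input is the sequence lying
immediately 5' of a mapped transcription initiation site, written
5' to 3' so that its last base is position -1, and that literal
"TATA" and "AGGA" strings stand in for the boxes; real box consensus
sequences are more degenerate. The function name and window table are
hypothetical.

```python
import re

# (box name, literal motif, nearest and farthest allowed distance in bp
# upstream of the transcription initiation site); windows follow the text.
BOXES = (
    ("TATA", "TATA", 25, 35),
    ("AGGA", "AGGA", 50, 60),
)


def find_promoter_boxes(upstream: str):
    """Return (box name, position) pairs for motifs whose first base lies in
    the expected window; positions are negative, relative to the start site."""
    seq = upstream.upper()
    n = len(seq)
    hits = []
    for name, motif, near, far in BOXES:
        for m in re.finditer(f"(?={motif})", seq):
            distance = n - m.start()  # bp between motif start and the start site
            if near <= distance <= far:
                hits.append((name, -distance))
    return hits


if __name__ == "__main__":
    # Synthetic 100-bp upstream region, for illustration only.
    demo = "C" * 45 + "AGGA" + "C" * 21 + "TATA" + "C" * 26
    print(find_promoter_boxes(demo))
```

Measuring the distance from the first base of the motif is an
arbitrary convention of this sketch; any consistent convention would
serve for confirming a mapped start site.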

According to the present invention, a SCR promoter
is one that confers to an operably linked gene in a
transgenic plant tissue-specific expression in roots, root
nodules, stems and/or embryos. A SCR promoter comprises the
region between about -5,000 bp and +1 bp upstream of the
transcription initiation site of a SCR gene. In a particular
embodiment, the Arabidopsis SCR promoter comprises the region
between positions -2.5 kb and +1 in the 5' upstream region of
the Arabidopsis SCR gene (see FIGS. 5A and 14).

5.2.1. CIS-REGULATORY ELEMENTS OF SCR PROMOTERS

According to the present invention, the
cis-regulatory elements within a SCR promoter may be identified
using any method known in the art. For example, the location
of cis-regulatory elements within an inducible promoter may
be identified using methods such as DNase or chemical
footprinting (e.g., Meier et al., 1991, Plant Cell 3:309-315)
or gel retardation (e.g., Weissenborn & Larson, 1992, J.
Biol. Chem. 267:6122-6131; Beato, 1989, Cell 56:335-344;
Johnson et al., 1989, Ann. Rev. Biochem. 58:799-839).
Additionally, resectioning experiments may also be employed
to define the location of the cis-regulatory elements. For
example, an inducible promoter-containing fragment may be
resected from either the 5'- or 3'-end using restriction
enzyme or exonuclease digests.

To determine the location of cis-regulatory
elements within the sequence containing the inducible
promoter, the 5'- or 3'-resected fragments, internal
fragments to the inducible promoter containing sequence, or
inducible promoter fragments containing sequences identified
by footprinting or gel retardation experiments may be fused
to the 5'-end of a truncated plant promoter, and the activity
of the chimeric promoter in transgenic plants examined.
Useful truncated promoters to these ends comprise sequences
starting at or about the transcription initiation site and
extending to no more than 150 bp 5' upstream. These
truncated promoters generally are inactive or are only
minimally active. Examples of such truncated plant promoters
may include, among others, a "minimal" CaMV 35S promoter
whose 5' end terminates at position -46 bp with respect to
the transcription initiation site (Skriver et al., Proc.
Natl. Acad. Sci. USA 88:7266-7270); the truncated "-90 35S"
promoter in the X-GUS-90 vector (Benfey & Chua, 1989, Science
244:174-181); a truncated "-101 nos" promoter derived from
the nopaline synthase promoter (Aryan et al., 1991, Mol. Gen.
Genet. 225:65-71); and the truncated maize Adh-1 promoter in
pADcat 2 (Ellis et al., 1987, EMBO J. 6:11-16).
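
The resection-and-fusion strategy described above can be laid out in
silico before any cloning is attempted. The following sketch is only
an illustration under assumptions: sequences are handled as plain
strings, the step size and minimum fragment length are arbitrary, and
both function names are hypothetical. It enumerates nested 5'
deletions of a promoter-containing fragment and joins a candidate
element to the 5'-end of a truncated promoter of no more than 150 bp.

```python
def five_prime_deletions(fragment: str, step: int, min_len: int):
    """Nested 5' resections: progressively shorter fragments that all keep
    the same 3' end, stopping once fewer than `min_len` bases would remain."""
    if step <= 0:
        raise ValueError("step must be positive")
    last_start = len(fragment) - min_len
    return [fragment[start:] for start in range(0, last_start + 1, step)]


def fuse_to_truncated_promoter(element: str, truncated_promoter: str,
                               max_core: int = 150) -> str:
    """Place a candidate cis-regulatory fragment at the 5'-end of a truncated
    promoter, enforcing the 150 bp upper bound given in the text."""
    if len(truncated_promoter) > max_core:
        raise ValueError("truncated promoter exceeds the allowed length")
    return element + truncated_promoter


if __name__ == "__main__":
    # Placeholder letters stand in for real sequence.
    for piece in five_prime_deletions("ABCDEFGHIJ", step=3, min_len=4):
        print(fuse_to_truncated_promoter(piece, "tata-core"))
```

Each printed string corresponds to one chimeric promoter that would
then be tested for activity in transgenic plants.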

According to the present invention, a
cis-regulatory element of a SCR promoter is a sequence that
confers to a truncated promoter tissue-specific expression in
embryos, stems, root nodules and/or roots.

5.2.2. SCR PROMOTER-DRIVEN EXPRESSION VECTORS

The properties of the nucleic acid sequences are
varied as are the genetic structures of various potential
host plant cells. In the preferred embodiments of the
present invention, described herein, a number of features
which an artisan may recognize as not being absolutely
essential, but clearly advantageous, are used. These include
methods of isolation, synthesis or construction of gene
constructs, the manipulation of the gene constructs to be
introduced into plant cells, certain features of the gene
constructs, and certain features of the vectors associated
with the gene constructs.

Further, the gene constructs of the present
invention may be encoded on DNA or RNA molecules. According
to the present invention, it is preferred that the desired,
stable genotypic change of the target plant be effected
through genomic integration of exogenously introduced nucleic
acid construct(s), particularly recombinant DNA constructs.
Nonetheless, according to the present invention, such
genotypic changes can also be effected by the introduction of
episomes (DNA or RNA) that can replicate autonomously and
that are somatically and germinally stable. Where the
introduced nucleic acid constructs comprise RNA, plant
transformation or gene expression from such constructs may
proceed through a DNA intermediate produced by reverse
transcription.

The present invention provides for use of
recombinant DNA constructs which contain tissue-specific and
developmental-specific promoter fragments and functional
portions thereof. As used herein, a functional portion of a
SCR promoter is capable of functioning as a tissue-specific
promoter in the embryo, stem, root nodule and/or root of a
plant. The functionality of such sequences can be readily
established by any method known in the art. Such methods
include, for example, constructing expression vectors with
such sequences and determining whether they confer
tissue-specific expression in the embryo, stem, root nodule
and/or root to an operably linked gene. In a particular
embodiment, the invention provides for the use of the
Arabidopsis SCR promoter contained in the sequences depicted
in FIGS. 5A and 14 and the insert DNA of plasmid pGEX-2TK*.

The SCR promoters of the invention may be used to
direct the expression of any desired protein, or to direct
the expression of a RNA product, including, but not limited
to, an "antisense" RNA or ribozyme. Such recombinant
constructs generally comprise a native SCR promoter or a
recombinant SCR promoter derived therefrom, ligated to the
nucleic acid sequence encoding a desired heterologous gene
product.

A recombinant SCR promoter is used herein to refer
to a promoter that comprises a functional portion of a native
SCR promoter or a promoter that contains native promoter
sequences that is modified by a regulatory element from a SCR
promoter. Alternatively, a recombinant inducible promoter
derived from the SCR promoter may be a chimeric promoter,
comprising a full-length or truncated plant promoter modified
by the attachment of one or more SCR cis-regulatory elements.

The manner of chimeric promoter constructions may
be any well known in the art. For examples of approaches
that can be used in such constructions, see Section 5.1.2.,
above and Fluhr et al., 1986, Science 232:1106-1112; Ellis et
al., 1987, EMBO J. 6:11-16; Strittmatter & Chua, 1987, Proc.
Natl. Acad. Sci. USA 84:8986-8990; Poulsen & Chua, 1988, Mol.
Gen. Genet. 214:16-23; Comai et al., 1991, Plant Mol. Biol.
15:373-381; Aryan et al., 1991, Mol. Gen. Genet. 225:65-71.

According to the present invention, where a SCR
promoter or a recombinant SCR promoter is used to express a
desired protein, the DNA construct is designed so that the
protein coding sequence is ligated in phase with the
translational initiation codon downstream of the promoter.
Where the promoter fragment is missing 5' leader sequences, a
DNA fragment encoding both the protein and its 5' RNA leader
sequence is ligated immediately downstream of the
transcription initiation site. Alternatively, an unrelated
5' RNA leader sequence may be used to bridge the promoter and
the protein coding sequence. In such instances, the design
should be such that the protein coding sequence is ligated in
phase with the initiation codon present in the leader
sequence, or ligated such that no initiation codon is
interposed between the transcription initiation site and the
first methionine codon of the protein.
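
The two acceptable designs described above (coding sequence in phase
with a leader initiation codon, or no initiation codon ahead of the
protein's first methionine codon) can be checked on the planned
sequence before ligation. The sketch below is a simplified
illustration: it treats the leader as the DNA from the transcription
initiation site to the junction, looks only at the first upstream ATG,
and adds an assumed extra check for an in-frame stop codon. The
function name and return labels are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_fusion(leader: str, cds: str) -> str:
    """Classify a leader/coding-sequence junction.

    leader: transcribed sequence from the initiation site to the junction.
    cds:    coding sequence beginning with the protein's first ATG.
    """
    leader, cds = leader.upper(), cds.upper()
    if not cds.startswith("ATG"):
        raise ValueError("cds must begin with the first methionine codon")
    junction = len(leader)
    first = (leader + cds).find("ATG")  # first initiation codon in the transcript
    if first == junction:
        return "no_upstream_atg"
    if (junction - first) % 3 != 0:
        return "upstream_atg_out_of_phase"
    codons = [leader[i:i + 3] for i in range(first, junction, 3)]
    if any(codon in STOP_CODONS for codon in codons):
        return "upstream_atg_in_phase_but_stopped"
    return "upstream_atg_in_phase"


if __name__ == "__main__":
    # Hypothetical leaders joined to a toy coding sequence.
    for leader in ("GGCACGAGC", "GGATGGCC", "GGATGGC", "ATGTAAGGG"):
        print(leader, check_fusion(leader, "ATGGCTTAA"))
```

Under these assumptions, "no_upstream_atg" and
"upstream_atg_in_phase" correspond to the two designs the text
permits, while the other two labels flag junctions that would need to
be redesigned.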

Further, it may be desirable to include additional
DNA sequences in the protein expression constructs. Examples
of additional DNA sequences include, but are not limited to,
those encoding: a 3' untranslated region; a transcription
termination and polyadenylation signal; an intron; a signal
peptide (which facilitates the secretion of the protein); or
a transit peptide (which targets the protein to a particular
cellular compartment such as the nucleus, chloroplast,
mitochondria, or vacuole).

5.3. PRODUCTION OF TRANSGENIC PLANTS AND PLANT CELLS

According to the present invention, a desirable
plant or plant cell may be obtained by transforming a plant
cell with the nucleic acid constructs described herein. In
some instances, it may be desirable to engineer a plant or
plant cell with several different gene constructs. Such
engineering may be accomplished by transforming a plant or
plant cell with all of the desired gene constructs
simultaneously. Alternatively, the engineering may be
carried out sequentially, that is, by transforming with one
gene construct, obtaining the desired transformant after
selection and screening, transforming the transformant with a
second gene construct, and so on.

In an embodiment of the present invention,
Agrobacterium is employed to introduce the gene constructs
into plants. Such transformations preferably use binary
Agrobacterium T-DNA vectors (Bevan, 1984, Nuc. Acid Res.
12:8711-8721), and the co-cultivation procedure (Horsch et
al., 1985, Science 227:1229-1231). Generally, the
Agrobacterium transformation system is used to engineer
dicotyledonous plants (Bevan et al., 1982, Ann. Rev. Genet.
16:357-384; Rogers et al., 1986, Methods Enzymol.
118:627-641). The Agrobacterium transformation system may also
be used to transform, as well as transfer, DNA to
monocotyledonous plants and plant cells (see Hernalsteen et
al., 1984, EMBO J 3:3039-3041; Hooykass-Van Slogteren et al.,
1984, Nature 311:763-764; Grimsley et al., 1987, Nature
325:177-179; Boulton et al., 1989, Plant Mol. Biol.
12:31-40; Gould et al., 1991, Plant Physiol. 95:426-434).

In other embodiments, various alternative methods
for introducing recombinant nucleic acid constructs into
plants and plant cells may also be utilized. These other
methods are particularly useful where the target is a
monocotyledonous plant or plant cell. Alternative gene
transfer and transformation methods include, but are not
limited to, protoplast transformation through calcium-,
polyethylene glycol (PEG)- or electroporation-mediated uptake
of naked DNA (see Paszkowski et al., 1984, EMBO J
3:2717-2722; Potrykus et al., 1985, Mol. Gen. Genet. 199:169-177;
Fromm et al., 1985, Proc. Natl. Acad. Sci. USA 82:5824-5828;
Shimamoto, 1989, Nature 338:274-276), and electroporation of
plant tissues (D'Halluin et al., 1992, Plant Cell
4:1495-1505). Additional methods for plant cell transformation
include microinjection, silicon carbide mediated DNA uptake
(Kaeppler et al., 1990, Plant Cell Reports 9:415-418), and
microprojectile bombardment (see Klein et al., 1988, Proc.
Natl. Acad. Sci. USA 85:4305-4309; Gordon-Kamm et al., 1990,
Plant Cell 2:603-618).

According to the present invention, a wide variety
of plants may be engineered for the desired physiological and
agronomic characteristics described herein using the nucleic
acid constructs of the instant invention and the various
transformation methods mentioned above. In preferred
embodiments, target plants for engineering include, but are
not limited to, crop plants such as maize, wheat, rice,
soybean, tomato, tobacco, carrots, peanut, potato, sugar
beets, sunflower, yam, Arabidopsis, rape seed, and petunia;
and trees such as spruce.

According to the present invention, desired plants
and plant cells may be obtained by engineering the gene
constructs described herein into a variety of plant cell
types, including but not limited to, protoplasts, tissue
culture cells, tissue and organ explants, pollen, embryos as
well as whole plants. In an embodiment of the present
invention, the engineered plant material is selected or
screened for transformants (i.e., those that have
incorporated or integrated the introduced gene construct(s))
following the approaches and methods described below. An
isolated transformant may then be regenerated into a plant.
Alternatively, the engineered plant material may be
regenerated into a plant, or plantlet, before subjecting the
derived plant, or plantlet, to selection or screening for the
marker gene traits. Procedures for regenerating plants from
plant cells, tissues or organs, either before or after
selecting or screening for marker gene(s), are well known to
those skilled in the art.

A transformed plant cell, callus, tissue or plant
may be identified and isolated by selecting or screening the
engineered plant material for traits encoded by the marker
genes present on the transforming DNA. For instance,
selection may be performed by growing the engineered plant
material on media containing inhibitory amounts of the
antibiotic or herbicide to which the transforming marker gene
construct confers resistance. Further, transformed plants
and plant cells may also be identified by screening for the
activities of any visible marker genes (e.g., the
β-glucuronidase, luciferase, B or C1 genes) that may be present
on the recombinant nucleic acid constructs of the present
invention. Such selection and screening methodologies are
well known to those skilled in the art.

Physical and biochemical methods may also be used
to identify a plant or plant cell transformant containing the
gene constructs of the present invention. These methods
include but are not limited to: 1) Southern analysis or PCR
amplification for detecting and determining the structure of
the recombinant DNA insert; 2) Northern blot, S1 RNase
protection, primer-extension or reverse transcriptase-PCR
amplification for detecting and examining RNA transcripts of
the gene constructs; 3) enzymatic assays for detecting enzyme
or ribozyme activity, where such gene products are encoded by
the gene construct; 4) protein gel electrophoresis, western
blot techniques, immunoprecipitation, or enzyme-linked
immunoassays, where the gene construct products are proteins;
and 5) biochemical measurements of compounds produced as a
consequence of the expression of the introduced gene
constructs. Additional techniques, such as in situ
hybridization, enzyme staining, and immunostaining, may also
be used to detect the presence or expression of the
recombinant construct in specific plant organs and tissues.
The methods for doing all these assays are well known to
those skilled in the art.

5.3.1. TRANSGENIC PLANTS THAT ECTOPICALLY
EXPRESS SCR

In accordance with the present invention, a plant
that expresses a recombinant SCR gene may be engineered by
transforming a plant cell with a gene construct comprising a
plant promoter operably associated with a sequence encoding
SCR protein or a fragment thereof. (Operably associated is
used herein to mean that transcription controlled by the
"associated" promoter would produce a functional messenger
RNA, whose translation would produce the protein.) The plant
promoter may be constitutive or inducible. Useful
constitutive promoters include, but are not limited to, the
CaMV 35S promoter, the T-DNA mannopine synthetase promoter,
and their various derivatives. Useful inducible promoters
include but are not limited to the promoters of ribulose
bisphosphate carboxylase (RUBISCO) genes, chlorophyll a/b
binding protein (CAB) genes, heat shock genes, defense
responsive genes (e.g., phenylalanine ammonia lyase genes),
wound induced genes (e.g., hydroxyproline rich cell wall
protein genes), chemically-inducible genes (e.g., nitrate
reductase genes, glucanase genes, chitinase genes, PR-1 genes
etc.), dark-inducible genes (e.g., asparagine synthetase gene
(Coruzzi and Tsai, U.S. Patent 5,256,558, October 26, 1993,
Gene Encoding Plant Asparagine Synthetase)), and
developmentally regulated genes (e.g., Shoot Meristemless
gene), to name just a few.

In yet another embodiment of the present invention, 
it may be advantageous to transform a plant with a gene 
construct operably linking a modified or artificial promoter 
to a sequence encoding SCR protein or a fragment thereof. 
Typically, such promoters, constructed by recombining 
structural elements of different promoters, have unique 
expression patterns and/or levels not found in natural 
promoters. See, e.g., Salinas et al., 1992, Plant Cell 
4:1485-1493, for examples of artificial promoters constructed 
by combining cis-regulatory elements with a promoter core. 

In a preferred embodiment of the present invention, 
the associated promoter is a strong and root, root nodule, 
stem and/or embryo-specific plant promoter such that the SCR 
protein is overexpressed in the transgenic plant. Examples 
of root- and root nodule-specific promoters include but are 
not limited to the promoters of SCR genes, SHR genes, 
leghemoglobin genes, nodulin genes and root-specific 
glutamine synthetase genes (see, e.g., Tingey et al., 1987, 
EMBO J. 6:1-9; Edwards et al., 1990, Proc. Natl. Acad. Sci. 
USA 87:3459-3463). 

In yet another preferred embodiment of the present 
invention, the overexpression of SCR protein in roots may be 
engineered by increasing the copy number of the SCR gene. 
One approach to producing such transgenic plants is to 
transform with nucleic acid constructs that contain multiple 
copies of the complete SCR gene (i.e., with its own native 
SCR promoter). Another approach is to repeatedly transform 
successive generations of a plant line with one or more 
copies of the complete SCR gene. Yet another approach is to 
place a complete SCR gene in a nucleic acid construct 
containing an amplification-selectable marker (ASM) gene such 
as the glutamine synthetase or dihydrofolate reductase gene. 
Cells transformed with such constructs are subjected to 
culturing regimes that select cell lines with increased 
copies of the complete SCR gene. See, e.g., Donn et al., 1984, 
J. Mol. Appl. Genet. 2:549-562, for a selection protocol used 
to isolate a plant cell line containing amplified copies 
of the GS gene. Because the desired gene is closely linked 
to the ASM, cell lines that amplified the ASM gene are also 
likely to have amplified the SCR gene. Cell lines with 
amplified copies of the SCR gene can then be regenerated into 
transgenic plants. 

5.3.2. TRANSGENIC PLANTS THAT SUPPRESS 
ENDOGENOUS SCR EXPRESSION 

In accordance with the present invention, a desired 
plant may be engineered by suppressing SCR activity. In one 
embodiment, the suppression may be engineered by transforming 
a plant with a gene construct encoding an antisense RNA or 
ribozyme complementary to a segment or the whole of SCR RNA 
transcript, including the mature target mRNA. In another 
embodiment, SCR gene suppression may be engineered by 
transforming a plant cell with a gene construct encoding a 
ribozyme that cleaves the SCR mRNA transcript. 
Alternatively, the plant can be engineered, e.g., via 
targeted homologous recombination, to inactivate or "knock-out" 
expression of the plant's endogenous SCR. 

For all of the aforementioned suppression 
constructs, it is preferred that such gene constructs express 
specifically in the root, root nodule, stem and/or embryo 
tissues. Alternatively, it may be preferred to have the 
suppression constructs expressed constitutively. Thus, 
constitutive promoters, such as the nopaline synthase or CaMV 35S 
promoter, may also be used to express the suppression 
constructs. A most preferred promoter for these suppression 
constructs is a SCR or SHR promoter. 

In accordance with the present invention, desired 
plants with suppressed target gene expression may also be 
engineered by transforming a plant cell with a co-suppression 
construct. A co-suppression construct comprises a functional 
promoter operatively associated with a complete or partial 
SCR gene sequence. It is preferred that the operatively 
associated promoter be a strong, constitutive promoter, such 
as the CaMV 35S promoter. Alternatively, the co-suppression 
construct promoter can be one that expresses with the same 
tissue and developmental specificity as the SCR gene. 

According to the present invention, it is preferred 
that the co-suppression construct encodes an incomplete SCR 
mRNA, although a construct encoding a fully functional SCR 
mRNA or enzyme may also be useful in effecting co-suppression. 

In accordance with the present invention, desired 
plants with suppressed target gene expression may also be 
engineered by transforming a plant cell with a construct that 
can effect site-directed mutagenesis of the SCR gene. (See, 
e.g., Offringa et al., 1990, EMBO J. 9:3077-84; and Kanevskii 
et al., 1990, Dokl. Akad. Nauk. SSSR 312:1505-1507, for 
discussions of nucleic acid constructs for effecting site-directed 
mutagenesis of target genes in plants.) It is preferred that 
such constructs effect suppression of the SCR gene by replacing 
the endogenous SCR gene sequence through homologous 
recombination with no SCR protein coding sequence or with an 
inactive one. 

5.3.3. TRANSGENIC PLANTS THAT EXPRESS A 
TRANSGENE CONTROLLED BY THE SCR PROMOTER 

In accordance with the present invention, a desired 
plant may be engineered to express a gene of interest under 
the control of the SCR promoter. SCR promoters and 
functional portions thereof refer to regions of the nucleic 
acid sequence which are capable of promoting tissue-specific 
transcription of an operably linked gene of interest in the 
embryo, stem, root nodule and/or root of a plant. The SCR 
promoter described herein refers to the regulatory elements 
of SCR genes as described in Section 5.2. 

Genes that may be beneficially expressed in the 
roots and/or root nodules of plants include genes involved in 
nitrogen fixation or cytokinins or auxins, or genes which 
regulate growth, or growth of roots. In addition, genes 
encoding proteins that confer on plants herbicide, salt, or 
pest resistance may be engineered for root specific 
expression. The nutritional value of root crops may also be 
enhanced through SCR promoter driven expression of 
nutritional proteins. Alternatively, therapeutically useful 
proteins may be expressed specifically in root crops. 

Genes that may be beneficially expressed in the 
stems of plants include those involved in starch, lignin or 
cellulose biosynthesis. 

In accordance with the present invention, desired 
plants which express a heterologous gene of interest under 
the control of the SCR promoter may be engineered by 
transforming a plant cell with SCR promoter driven constructs 
using those techniques described in Sections 5.2.2. and 5.3., 
supra. 

5.3.4. SCREENING OF TRANSFORMED PLANTS FOR THOSE 
HAVING DESIRED ALTERED TRAITS 

It will be recognized by those skilled in the art 
that in order to obtain transgenic plants having the desired 
engineered traits, screening of transformed plants (i.e., 
those having a gene construct of the invention) for those 
traits may be required. For example, where the plants have 
been engineered for ectopic overexpression of the SCR gene, 
transformed plants are examined for those expressing the SCR 
gene at the desired level and in the desired tissues and 
developmental stages. Where the plants have been engineered 
for suppression of the SCR gene product, transformed plants 
are examined for those expressing the SCR gene product (e.g., 
RNA or protein) at reduced levels in various tissues. The 
plants exhibiting the desired physiological changes, e.g., 
ectopic SCR overexpression or SCR suppression, may then be 
subsequently screened for those plants that have the desired 
structural changes at the plant level (e.g., transgenic 
plants with overexpression or suppression of the SCR gene having 
the desired altered root structure). The same principle 
applies to obtaining transgenic plants having tissue-specific 
expression of a heterologous gene in embryos and/or roots by 
the use of a SCR promoter driven expression construct. 

Alternatively, the transformed plants may be 
directly screened for those exhibiting the desired structural 
and functional changes. In one embodiment, such screening 
may be for the size, length or pattern of the root of the 
transformed plants. In another embodiment, the screening of 
the transformed plants may be for altered gravitropism or 
decreased susceptibility to lodging. In other embodiments, 
the screening of the transformed plants may be for improved 
agronomic characteristics (e.g., faster growth, greater 
vegetative or reproductive yields, or improved protein 
contents, etc.), as compared to unengineered progenitor 
plants, when cultivated under various growth conditions 
(e.g., soils or media containing different amounts of 
nutrients or different water content). 

According to the present invention, plants 
engineered with SCR overexpression may exhibit improved 
vigorous growth characteristics when cultivated under 
conditions where larger and thicker roots are advantageous. 
Plants engineered for SCR suppression may exhibit improved 
vigorous growth characteristics when cultivated under 
conditions where thinner roots are advantageous. 

Engineered plants and plant lines possessing such 
improved agronomic characteristics may be identified by 
examining any of the following parameters: 1) the rate of growth, 
measured in terms of rate of increase in fresh or dry weight; 
2) vegetative yield of the mature plant, in terms of fresh or 
dry weight; 3) the seed or fruit yield; 4) the seed or fruit 
weight; 5) the total nitrogen content of the plant; 6) the 
total nitrogen content of the fruit or seed; 7) the free 
amino acid content of the plant; 8) the free amino acid 
content of the fruit or seed; 9) the total protein content of 
the plant; and 10) the total protein content of the fruit or 
seed. The procedures and methods for examining these 
parameters are well known to those skilled in the art. 

According to the present invention, a desired plant 
is one that exhibits improvement over the control plant 
(i.e., progenitor plant) in one or more of the aforementioned 
parameters. In an embodiment, a desired plant is one that 
shows at least 5% increase over the control plant in at least 
one parameter. In a preferred embodiment, a desired plant is 
one that shows at least 20% increase over the control plant 
in at least one parameter. Most preferred is a plant that 
shows at least 50% increase in at least one parameter. 
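
As an illustration only, the 5%, 20% and 50% tiers described above can be expressed as a small classification routine. The sketch below is in Python; the function names `percent_increase` and `classify_plant`, the tier labels, and the parameter keys are hypothetical and are not part of the disclosure. It assumes each parameter is a positive measurement taken on both the engineered plant and its progenitor control.

```python
def percent_increase(engineered: float, control: float) -> float:
    """Percent change of an engineered-plant measurement relative to control."""
    if control <= 0:
        raise ValueError("control measurement must be positive")
    return (engineered - control) / control * 100.0


def classify_plant(engineered: dict, control: dict) -> str:
    """Return the highest tier reached by at least one parameter.

    Thresholds follow the 5%, 20% and 50% figures stated in the text;
    the label strings themselves are illustrative assumptions.
    """
    best = max(percent_increase(engineered[k], control[k]) for k in control)
    if best >= 50.0:
        return "most preferred"
    if best >= 20.0:
        return "preferred"
    if best >= 5.0:
        return "desired"
    return "not improved"
```

A plant needs to clear a threshold in only one parameter to qualify, which is why the routine takes the maximum across parameters rather than an average.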

6. EXAMPLE 1: ARABIDOPSIS SCR GENE 
This example describes the cloning and structure of 
the Arabidopsis SCR gene and its expression. The deduced 
amino acid sequence of the Arabidopsis SCR gene product 
contains a number of potential functional domains similar to 
those found in transcription factors. Closely related 
sequences have been found in both dicots and monocots, 
indicating that Arabidopsis SCR is a member of a new protein 
family. The expression pattern of the SCR gene was 
characterized by means of in situ hybridization and by an 
enhancer trap insertion upstream of the SCR gene (described 
in more detail in Section 7). The expression pattern is 
consistent with a key role for Arabidopsis SCR in regulating 
the asymmetric division of the cortex/endodermis initial 
which is essential for generating the radial organization of 
the root. 

6.1. MATERIALS AND METHODS 

6.1.1. PLANT CULTURE 
Arabidopsis ecotypes Wassilewskija (Ws), Columbia 
(Col), and Landsberg erecta (Ler) were obtained from Lehle. 
Arabidopsis seeds were surface sterilized and grown as 
described previously (Benfey et al., 1993, Development 
119:57-70). Generation of the enhancer trap lines is 
described in Section 7. 

6.1.2. GENETIC ANALYSIS 

For the scr-1 allele, co-segregation of the mutant 
phenotype and kanamycin resistance conferred by the inserted 
T-DNA was determined as described previously (Aeschbacher et 
al., 1995, Genes & Development 9:330-340). Because kanamycin 
affects root growth, 1557 seeds from heterozygous lines were 
germinated on non-selective media, scored for the appearance 
of the mutant phenotype, and subsequently transferred to 
selective media. All (284) phenotypically mutant seedlings 
showed resistance to the antibiotic, whereas 834 of 1273 
phenotypically wild-type seedlings showed resistance to 
kanamycin. Phenotypically wild-type plants 
(83) were also transferred to soil and allowed to set seeds. 
The progeny of these plants were plated on selective and 
non-selective media, and scored for the co-segregation of the 
mutant phenotype and antibiotic resistance. A majority (48) 
of the plants segregated for the mutant phenotype and for 
kanamycin resistance, whereas 35 were wild-type and sensitive 
to kanamycin. 

Due to a mis-identified cross, scr-2 was 
originally thought to be non-allelic and was named pinocchio 
(Scheres et al., 1995, Development 121:53-62). Subsequent 
mapping results placed it in an identical chromosomal 
location as scr-1. The original scr-2 line contained at 
least two T-DNA inserts. Co-segregation analysis revealed a 
lack of linkage between the antibiotic resistance marker 
carried by the T-DNA and the mutant phenotype. Antibiotic 
sensitive lines were identified that segregated for mutants. 
These lines were crossed to scr-1. All F1 antibiotic 
resistant progeny exhibited a mutant phenotype. All F2 
progeny (from independent lines) were mutant, and there was a 
3:1 segregation for antibiotic resistance, indicating that the 
two mutations were allelic. Antibiotic sensitive lines of 
scr-2 were found to contain a rearranged T-DNA insert as 
determined by Southern blots and PCR using T-DNA specific 
probes and primers, respectively. The presence of this T-DNA 
in the SCR gene was confirmed by Southern blots using SCR 
probes. A combination of T-DNA and SCR specific primers was 
used to amplify T-DNA/SCR junctions. The PCR fragments were 
cloned using the TA cloning kit (Invitrogen) and sequenced. 
The insertion points were determined for both 5' and 3' 
T-DNA/SCR junctions. 
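
The co-segregation counts above can be checked against a Mendelian expectation with a chi-square goodness-of-fit test. The following Python sketch is an illustration and not part of the original analysis. It assumes that, for a recessive mutation caused by a single kanamycin-resistance-tagged T-DNA insert, two thirds of the phenotypically wild-type progeny of a heterozygote carry the insert (a 2:1 ratio); this expected ratio, the function name `chi_square_gof`, and the 5% significance level are assumptions introduced here. The value 3.841 is the standard chi-square critical value for one degree of freedom at the 5% level.

```python
def chi_square_gof(observed, expected_ratio):
    """Chi-square goodness-of-fit statistic for observed class counts."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    stat = 0.0
    for obs, part in zip(observed, expected_ratio):
        exp = total * part / ratio_sum
        stat += (obs - exp) ** 2 / exp
    return stat


CRITICAL_1DF_5PCT = 3.841  # chi-square critical value, 1 d.f., alpha = 0.05

# Phenotypically wild-type seedlings: 834 resistant, 1273 - 834 sensitive.
seedlings = chi_square_gof([834, 1273 - 834], [2, 1])

# Progeny-tested wild-type plants: 48 segregating, 35 non-segregating.
progeny = chi_square_gof([48, 35], [2, 1])

for label, stat in (("seedlings", seedlings), ("progeny test", progeny)):
    verdict = "consistent with" if stat < CRITICAL_1DF_5PCT else "deviates from"
    print(f"{label}: chi2 = {stat:.2f}, {verdict} 2:1")
```

Under the stated assumptions both statistics fall below the cutoff, so neither data set departs detectably from a 2:1 ratio among phenotypically wild-type individuals.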

6.1.3. MAPPING 

Mutant plants of scr-2 (Ws background) were crossed 
to Col WT. DNA from individual mutant F2 plants was 
analyzed for co-segregation with microsatellite (Bell & 
Ecker, 1994, Genomics 18:137-144) and CAPS markers (Konieczny 
& Ausubel, 1993, Plant J. 4:403-410). The closest linkage 
was found to two CAPS markers located at the bottom of 
chromosome III. Only one out of 238 mutant chromosomes was 
recombinant for the BGL1 marker (Konieczny & Ausubel, 1993, 
Plant J. 4:403-410) and one out of 210 chromosomes was 
recombinant for the cdc2b marker. 
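
For closely linked markers, the recombinant counts above translate into approximate map distances. The Python sketch below is illustrative; it assumes that for such small distances the percentage of recombinant chromosomes can be read directly as centimorgans, with no mapping-function correction, and the function name `map_distance_cm` is hypothetical.

```python
def map_distance_cm(recombinant: int, total: int) -> float:
    """Approximate map distance (cM) as percent recombinant chromosomes.

    Reasonable only for tightly linked loci, where double crossovers are
    negligible and no mapping function is applied (an assumption here).
    """
    if total <= 0 or not 0 <= recombinant <= total:
        raise ValueError("need total > 0 and 0 <= recombinant <= total")
    return 100.0 * recombinant / total


# Counts reported in the text for the two CAPS markers.
for marker, rec, total in (("BGL1", 1, 238), ("cdc2b", 1, 210)):
    print(f"{marker}: {map_distance_cm(rec, total):.2f} cM")
```

Both estimates come out a little under 0.5 cM, in line with the approximate distance quoted for these markers in Section 6.2.3.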

An RFLP for the SCR gene was identified between Col 
and Ler ecotypes with Xho I endonuclease. Genomic DNAs from 
independent RI lines (Jarvis et al., 1994, Plant Mol. Biol. 
24:685-687) were digested with Xho I and blots were 
hybridized to an SCR probe. Using the segregation data obtained for 
25 RI lines, the SCR gene was mapped relative to molecular 
markers by CLUSTER. The SCR gene was assigned to the bottom 
of chromosome III closest to BGL1. 

6.1.4. PHENOTYPIC ANALYSIS 

Morphological characterization of the mutant roots 
was performed as follows: 7 to 14 days post-germination, 
phenotypically mutant seedlings were fixed in 4.0% 
formaldehyde in PIPES buffer pH 7.2. After fixation the 
samples were dehydrated in ethanol followed by infiltration 
with Historesin (Jung-Leica, Heidelberg, Germany). Plastic 
sections were mounted on Superfrost slides (Fisher). The 
sections were either stained with 0.05% toluidine blue and 
photographed using Kodak 160T film or used for Casparian 
strip detection or antibody staining. 

Casparian strip detection was performed as 
described previously (Scheres et al., 1995, Development 
121:53-62), with the following modifications. Plastic 
sections were used and the counterstaining was done in 0.1% 
aniline blue for 5 to 15 min. The sections were visualized 
with a Leitz fluorescent microscope with FITC filter. 
Pictures were taken using a Leitz camera attached to the 
microscope and Kodak HC400 film. Slides were digitized with 
a Nikon slide scanner and manipulated in Adobe Photoshop. 

For antibody staining, sections were blocked for 2 
hours at room temperature in 1% BSA in PBS containing 0.1% 
Tween 20 (PBT). Samples were incubated with primary 
antibodies at 4°C in 1% BSA in PBT overnight, and then 
washed 3 times, 5 minutes each, with PBT. Samples were 
incubated for two hours with biotinylated secondary 
antibodies (Vector Laboratories) in PBT, and washed as above. 
Samples were incubated with Texas Red conjugated avidin D for 
2 hours at room temperature, washed as before, and mounted in 
Citifluor. Immunofluorescence was observed with a 
fluorescent microscope equipped with a Rhodamine filter. 
Staining with the CCRC antibodies was performed as described 
previously (Freshour et al., 1996, Plant Physiol. 
110:1413-1429). 

6.1.5. MOLECULAR TECHNIQUES 

Genomic DNA preparation was performed using the 
Elu-Quik kit (Schleicher & Schuell) protocol. Radioactive 
and non-radioactive DNA probes were labeled with either 
random primed labeling or PCR-mediated synthesis according to 
the Genius kit manual (Boehringer Mannheim). E. coli and 
Agrobacterium tumefaciens cells were transformed using a 
BIO-RAD gene pulser. Plasmid DNA was purified using the alkaline 
lysis method (Maniatis et al., Molecular Cloning: A 
Laboratory Manual, Cold Spring Harbor, New York: Cold Spring 
Harbor Laboratory, 1982). 

A probe made from a rescued fragment of 1.2 kb was 
used to screen a wild-type genomic library made from Ws 
plants. One genomic clone containing an insert of 
approximately 23 kb was isolated. A 3.0 kb Sac I fragment 
from the genomic clone, which hybridized to the 1.2 kb probe, 
was subcloned and sequenced (FIG. 5A). Comparison of the 
nucleotide sequence between the genomic clone and the rescued 
plasmid revealed the site of the T-DNA insertion. 
Approximately 600,000 plaques from a cDNA library, obtained 
from inflorescences and siliques (Col ecotype), and therefore 
enriched in embryos, were screened with the 1.2 kb probe. 
Four cDNA clones were isolated. The dideoxy sequencing 
method was performed using the Sequenase kit (United States 
Biochemical Corp.). Sequence-specific internal primers were 
synthesized and used to sequence the Sac I genomic subclone as 
well as the cDNA clones. Total RNA from plant tissues was obtained 
using phenol/chloroform extractions as described (Berry et 
al., 1985, Mol. Cell. Biol. 5:2238-2246) with minor 
modifications. Northern hybridization and detection were 
performed according to the Genius kit manual (Boehringer 
Mannheim). 

To identify the site of insertion of the enhancer-trap 
T-DNA, genomic DNA from ET199 homozygous plants was 
amplified using primers specific for the T-DNA left border 
and the SCR gene. An approximately 2.0 kb fragment was 
amplified. This fragment was sequenced and the site of 
insertion was found to be approximately 1 kb from the ATG 
start codon. 

6.1.6. IN SITU HYBRIDIZATION 

Antisense and sense SCR riboprobes were labeled 
with digoxigenin-11-UTP (Boehringer Mannheim) using T7 
polymerase following the manufacturer's protocol. Probes 
contained a 1.1 kb 3' portion of the cDNA. Probe 
purification, hydrolysis and quantification were performed as 
described in the Boehringer Mannheim Genius System user's 
guide. 

Tissue samples were fixed in 4% formaldehyde 
overnight at 4°C and rinsed two times in PBS (Jackson et al., 
1991, Plant Cell 3:115-125). They were subsequently 
pre-embedded in 1% agarose in PBS. The fixed tissue was 
dehydrated in ethanol, cleared in Hemo-De (Fisher Scientific, 
Pittsburgh, PA) and embedded in ParaplastPlus (Fisher 
Scientific). Tissue sections (10 µm thick) were mounted on 
SuperfrostPlus slides (Fisher Scientific). Section 
pretreatment and hybridization were performed according to 
Lincoln et al. (1994, Plant Cell 6:1859-1876) except that 
proteinase K was used at 30 mg/ml and a two hour 
prehybridization step was included. Probe concentration of 
50 ng/ml/kb was used in the hybridization. 
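
Because the probe concentration is specified per kilobase of probe, the working concentration scales with probe length. A minimal Python sketch of that arithmetic follows. The function name `probe_mass_ng` and the 0.1 ml example volume are hypothetical; the 50 ng/ml/kb figure and the 1.1 kb probe length come from the text, and the sketch assumes the relevant length is that of the full 1.1 kb transcript rather than the hydrolyzed fragments.

```python
def probe_mass_ng(probe_kb: float, volume_ml: float,
                  ng_per_ml_per_kb: float = 50.0) -> float:
    """Mass of riboprobe (ng) needed for a hybridization mix."""
    if probe_kb <= 0 or volume_ml <= 0:
        raise ValueError("probe length and volume must be positive")
    return ng_per_ml_per_kb * probe_kb * volume_ml


# 1.1 kb SCR riboprobe; 0.1 ml is an assumed example volume.
concentration = probe_mass_ng(1.1, 1.0)   # ng in 1 ml, i.e. ng/ml
per_aliquot = probe_mass_ng(1.1, 0.1)     # ng in a 0.1 ml aliquot
print(f"{concentration:.1f} ng/ml, {per_aliquot:.2f} ng per 0.1 ml")
```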

Slides were washed and the immunological detection 
was performed according to Coen et al. (1990, Cell 
63:1311-1322) with the following modifications. Slides were first 
washed 5 h in 5xSSC, 50% formamide. After RNase treatment 
slides were rinsed three times (20 min each) in the buffer 
(0.5 M NaCl, 10 mM Tris-HCl pH 8.0, 5.0 mM EDTA). In the 
immunological detection, antibody was diluted 1:1000, 
levamisole (240 ng/ml) was included in the detection buffer, 
and after stopping the reaction in 10 mM Tris, 1 mM EDTA, 
sections were mounted directly in Aqua-Poly/Mount 
(Polysciences, Warrington, PA). 

6.2. RESULTS 

6.2.1. CHARACTERIZATION OF THE SCR PHENOTYPE 

The scarecrow mutant scr-1 was isolated in a screen 
of T-DNA transformed Arabidopsis lines (Feldmann, K.A., 1991, 
Plant J. 1:71-82), as a seedling with greatly reduced root 
length compared to wild-type (Scheres et al., 1995, 
Development 121:53-62). A second mutant, scr-2, with a similar 
phenotype was subsequently identified among T-DNA transformed 
lines. Analysis of co-segregation between the mutant 
phenotype and antibiotic resistance carried by the T-DNA 
indicated tight linkage for scr-1 and no linkage for scr-2 
(see Materials and Methods). An antibiotic sensitive line 
of scr-2 was isolated and crossed with scr-1. The F2 progeny 
of this cross were all mutant and segregated 3:1 for 
antibiotic resistance, confirming allelism (see Materials and 
Methods). The principal phenotypic difference between the 
two alleles was that scr-1 root growth was more retarded than 
that of scr-2, suggesting that it is the stronger allele 
(FIG. 2A). For both alleles the aerial organs appeared 
similar to wild-type and the flowers were fertile (FIGS. 2A 
and 2B). The progeny of backcrosses of scr-1 or scr-2 to 
wild-type plants segregated 3:1 for the root phenotype for 
both alleles, indicating that each mutation is monogenic and 
recessive. 
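
The 3:1 ratio expected for a single recessive locus can be reproduced by enumerating the gametes of a selfed heterozygote. The Python sketch below is illustrative only; the allele symbols `S` (wild-type) and `s` (mutant) and the function name are hypothetical labels, and it assumes the segregating generation derives from self-fertilized heterozygous backcross progeny.

```python
from collections import Counter
from itertools import product


def selfed_heterozygote_phenotypes(dominant="S", recessive="s"):
    """Count progeny phenotypes from selfing a heterozygote (S/s)."""
    gametes = (dominant, recessive)
    counts = Counter()
    for egg, pollen in product(gametes, repeat=2):
        genotype = {egg, pollen}
        # Only the homozygous recessive class shows the mutant phenotype.
        phenotype = "mutant" if genotype == {recessive} else "wild-type"
        counts[phenotype] += 1
    return counts


print(selfed_heterozygote_phenotypes())
```

Three of the four equally likely gamete combinations carry at least one wild-type allele, which gives the 3 wild-type to 1 mutant ratio characteristic of a monogenic recessive trait.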

Analysis of transverse sections through the primary 
root of seedlings revealed only a single cell layer between 
the epidermis and the pericycle (FIG. 2C) instead of the 
normal radial organization consisting of cortex and 
endodermis (FIG. 2D). This radial organization defect was 
not limited to the primary root, but was also present in 
secondary roots (FIG. 2E) and in roots regenerated from calli 
(FIG. 2F). Occasionally defects were observed in the number 
of cells in the remaining cell layer (more than the invariant 
8 found in wild-type). Abnormal placement or numbers of 
epidermal cells were also observed (see FIG. 2E). These 
abnormalities were more frequently observed in scr-1 than in 
scr-2. Nevertheless, organization of the mutant root closely 
resembles that of wild-type except for the consistent 
reduction in the number of cell layers. Because the 
endodermis and cortex are normally generated by an asymmetric 
division of the cortex/endodermal initial, this indicates 
that the primary defect in scr is disruption of this 
asymmetric division. 

It has been shown that the radial organization 
defect in scr-1 first appears in the developing embryo at the 
early torpedo stage and manifests itself as a failure of the 
embryonic ground tissue to undergo the asymmetric division 
into cortex and endodermis (Scheres et al., 1995, Development 
121:53-62). This defect extends the length of the embryonic 
axis which encompasses the embryonic root and hypocotyl. 
Other embryonic tissues appear similar to wild-type (Scheres 
et al., 1995, Development 121:53-62). In seedling hypocotyls 
of the scarecrow phenotype, two cell layers instead of the 
normal three layers (two cortex and one endodermis) between 
epidermis and stele were found. This would be the expected 
result of the lack of the division of the embryonic ground 
tissue. Similar results were obtained for scr-2. Hence, 
this mutant identifies a gene involved in the asymmetric 
division that produces cortex and endodermis from ground 
tissue in the embryonic root and hypocotyl and from the 
cortex/endodermal initials in primary and secondary roots. 

6.2.2. CHARACTERIZATION OF CELL IDENTITY IN SCR 
ROOTS 

To understand the role of the Arabidopsis SCR gene 
in regulating this asymmetric division, it was necessary to 
determine the identity of the mutant cell layer. Tissue- 
specific markers were used to distinguish between several 
possibilities. The cell layer could have differentiated 
attributes of either cortex or endodermis. Alternatively, it 
could have an undifferentiated, initial-cell identity or it 
could have a chimeric identity with differentiated attributes 
of both endodermis and cortex in the same cell. 

Transverse sections of scr-1 and scr-2 roots were 
assayed for the presence of tissue-specific markers. The 
casparian strip, a deposition of suberin between radial cell 
walls, is specific to the endodermal cells and is believed to 
act as a barrier to the entry of solutes into the vasculature 
(Esau, K. Anatomy of Seed Plants, New York: John Wiley & 
Sons, 1977, Ed. 2, pp. 1-550). Histochemical staining 
revealed the presence of a casparian strip in the mutant cell 
layer (FIG. 3A, compare to wild-type, FIG. 3B). It is noted 
that in the vascular cylinder, this histochemical stain also 
reveals the presence of lignin, indicating the presence of 
differentiated xylem cells in mutant (FIG. 3A) and wild-type 
(FIG. 3B). Another marker of the differentiated endodermis 
is the arabinogalactan epitope recognized by the monoclonal 
antibody, JIM13 (Knox et al., 1990, Planta 181:512-521). The 
mutant cell layer showed staining with this antibody 
(FIG. 3C, compare with wild-type, FIG. 3D). As a positive 
control, the JIM7 antibody that recognizes pectin epitopes in 
all cell walls was used (FIGS. 3E and 3F). These results 
indicate that the cell layer between the epidermis and the 
pericycle has differentiated attributes of the endodermis. 

As a marker for the cortex, the CCRC-M2 monoclonal 
antibody was used. This antibody recognizes a cell wall 
oligosaccharide epitope, found only on differentiated cortex 
and epidermis cells. In sections from the differentiation 
zone of scr-1 and scr-2, both cortex and epidermal cells 
showed staining (FIGS. 4A and 4B) that was similar to that of 
wild-type (FIG. 4C). In scr-2, staining of both cell types 
was apparent, but staining of cortex was somewhat weaker than 
wild-type. The positive control used the CCRC-M1 monoclonal 
antibody which recognizes an oligosaccharide epitope found on 
all cells (FIGS. 4D-F). 

With the CCRC-M2 antibody an interesting difference 
was observed between the staining pattern of the mutants as 
compared to wild-type. The appearance of this epitope 
correlates with differentiation in these two cell types. 
Normally, in sections close to the root tip there is no 
staining; in sections higher up in the root, atrichoblasts 
(epidermal cells that do not make root hairs) stain. In 
sections from more mature root tissue, all epidermal cells as 
well as cortex cells stain for this epitope. In both scr-1 
and scr-2, sections could be found in which all epidermal 
cells stained while there was little detectable staining of 
cortex cells. Although not precisely identical to the 
wild-type staining pattern, the fact that the mutant cell layer 
clearly stains for this cortex marker indicates that there 
are cortex differentiated attributes expressed in these 
cells. 

Taken together, these results indicate that the 
mutant cell layer has differentiated attributes of both the 
endodermis and cortex. The possibility that there has been a 
simple deletion of a cell type, or that the resulting cell 
type remains in an undifferentiated initial-like stage can be 
ruled out. This result is consistent with a role for the SCR 
gene in regulating this asymmetric division rather than a 
role in directing cell specification. 

6.2.3. MOLECULAR CLONING OF THE SCR GENE 

To further elucidate the function of the 
Arabidopsis SCR gene the inserted T-DNA sequences were used 
to clone the gene. Plant DNA flanking the insertion site was 
obtained from scr-1 by plasmid rescue and used to isolate the 
corresponding wild-type genomic DNA. Several cDNA clones 
were isolated from a library made from silique tissue. 
Comparison of the sequence of the longest cDNA and the 
corresponding genomic region revealed an open reading frame 
(ORF) interrupted by a single small intron (FIG. 5A). A 
potential TATA box and polyadenylation signal that matched 
the consensus sequences for plant genes were also identified 
(Joshi, C.P., 1987, Nucl. Acids Res. 15:6643-6653; Heidecker 
& Messing, 1986, Ann. Rev. Plant Physiol. 37:439-466; Mogen 
et al., 1990, Plant Cell 2:1261-1272). 

Comparison of the nucleotide sequence between the
genomic clone and the rescued plasmid placed the site of the
T-DNA insertion in scr-1 at codon 470 (FIGS. 5A and 5B). For
scr-2, although no linkage was found between the mutant
phenotype and antibiotic resistance, DNA blot and PCR
analysis of antibiotic sensitive lines revealed the presence
of T-DNA sequences that co-segregated with the mutant
phenotype. The insertion position in scr-2 was determined by
cloning and sequencing the PCR products amplified from its
genomic DNA using a combination of T-DNA and SCR specific
primers at both sides of the insertion (FIG. 5B). In scr-2
the T-DNA insertion point is at codon 605 (FIGS. 5A and 5B).
To verify linkage between the cloned gene and the mutant
phenotype, we identified the chromosomal location of both the
scr locus and the SCR gene. To map the scr locus, molecular
markers were used on F2 progeny of crosses between scr-2
(ecotype Wassilewskija, Ws) and Columbia (Col) WT. These
placed the scr locus at the bottom of chromosome III,
approximately 0.5 cM away from each of the two closest
markers available, cdc2b and BGL1 (Konieczny and Ausubel,
1993, Plant J. 4:403-410). To map the SCR gene, we
identified a polymorphism between Col and Landsberg (Ler)
ecotypes using the SCR probe b (FIG. 5B). Southern analysis
of 25 recombinant inbred lines (Jarvis et al., 1994, Plant
Mol. Biol. 24:685-687) mapped the cloned gene to the same
location as the scr locus on chromosome III.
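
As an illustration of how a map distance such as the 0.5 cM
figure above relates to recombinant counts, the following is a
minimal sketch only and is not the procedure used in these
experiments. It assumes that recombinant and total chromosomes
have been scored directly. The function names and the example
counts are hypothetical, and the Kosambi mapping function is a
standard choice introduced here purely for illustration.

```python
import math


def recombination_fraction(recombinants: int, total: int) -> float:
    """Fraction of scored chromosomes that are recombinant."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("counts must satisfy 0 <= recombinants <= total, total > 0")
    return recombinants / total


def kosambi_cm(r: float) -> float:
    """Convert a recombination fraction to centimorgans (Kosambi function)."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


# Hypothetical counts: 1 recombinant among 200 scored chromosomes.
r = recombination_fraction(1, 200)
print(f"r = {r:.4f}, distance = {kosambi_cm(r):.2f} cM")
```

For small recombination fractions the Kosambi distance is
nearly equal to 100 times the recombination fraction, so the
choice of mapping function matters little at this scale.
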
The determination of the molecular defects in two
independent alleles and the co-localization of the cloned
gene and the mutant locus confirm that we have identified
the SCR gene.



6.2.4. THE SCR GENE HAS MOTIFS THAT INDICATE IT
IS A TRANSCRIPTION FACTOR

The Arabidopsis SCR gene product is a 653 amino
acid polypeptide that contains several domains (FIG. 5B).
The amino-terminus has homopolymeric stretches of glutamine,
serine, threonine, and proline residues, which account for
44% of the first 267 residues. Domains rich in these
residues have been shown to activate transcription and may
serve such a role in SCR (Johnson et al., 1993, J. Nutr.
Biochem. 4:386-398). A charged region between residues 265
and 283 has similarity to the basic domain of the bZIP family
of transcriptional regulatory proteins (FIG. 5C) (Hurst,
H.C., 1994, Protein Profile 1:123-168). The basic domains
from several bZIP proteins have been shown to act as nuclear
localization signals (Varagona et al., 1992, Plant Cell
4:1213-1227), and this region in SCR may act similarly. This
charged region is followed by a leucine heptad repeat
(residues 291-322). A second leucine heptad repeat is found
toward the carboxy-terminus (residues 436 to 473). As
leucine heptad repeats have been demonstrated to mediate
protein-protein interactions in other proteins (Hurst, H.C.,
1994, Protein Profile 1:123-168), the existence of these
motifs suggests that SCR may function as a dimer or a
multimer. The second leucine heptad repeat is followed by a
small region rich in acidic residues, also present in a
number of defined transcriptional activation domains (Johnson
et al., 1993, J. Nutr. Biochem. 4:386-398). While each of
these domains has been found within proteins that do not act
as transcriptional regulators, the fact that all of them are
found within the deduced SCR protein sequence indicates that 
SCR is a transcriptional regulatory protein. 
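
The two kinds of sequence feature discussed above (residue
composition of an amino-terminal region and leucine heptad
repeats) can be illustrated with a short sketch. This is not
the analysis used to characterize SCR. The function names and
the toy sequences are hypothetical, and a heptad repeat is
assumed here, for simplicity, to mean a leucine at every
seventh position for a chosen number of consecutive heptads.

```python
def residue_fraction(seq: str, residues: str = "QSTP", region_end: int = 267) -> float:
    """Fraction of the first `region_end` residues that belong to `residues`."""
    region = seq[:region_end].upper()
    if not region:
        raise ValueError("empty sequence")
    return sum(region.count(r) for r in residues) / len(region)


def leucine_heptad_starts(seq: str, repeats: int = 4) -> list:
    """0-based start positions with leucine at i, i+7, ... for `repeats` heptads."""
    seq = seq.upper()
    span = 7 * (repeats - 1)
    return [
        i
        for i in range(len(seq) - span)
        if all(seq[i + 7 * k] == "L" for k in range(repeats))
    ]


# Toy sequences for illustration only; these are not SCR sequences.
toy_nterm = "QQQQSSTTPPAAAAAAAAAA"
toy_zipper = "LAAAAAALAAAAAALAAAAAAL"
print(residue_fraction(toy_nterm, region_end=20))
print(leucine_heptad_starts(toy_zipper))
```

Applied to a real protein sequence, `residue_fraction` with the
default arguments would report the share of glutamine, serine,
threonine and proline in the first 267 residues, which is the
quantity quoted in the text as 44%.
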

6.2.5. SCR IS A MEMBER OF A NOVEL PROTEIN FAMILY

The Arabidopsis SCR protein sequence was compared
with the sequences in the available databases. Eleven
expressed sequence tags (ESTs), nine from Arabidopsis, one
from rice and one from maize, showed significant similarity
to residues 394 to 435 of the SCR sequence, a region
immediately amino-terminal to the second leucine heptad
repeat (FIGS. 15K-L). This region is designated the VHIID
domain. Subsequent analysis of these EST sequences has
revealed that the sequence similarity extends beyond this
region; in fact, the similarity extends throughout the entire
known gene products. The combination and order of the motifs
found in these sequences do not show significant similarity
to the general structures of other established regulatory
protein families (i.e., bZIP, zinc finger, MADS-domain, and
homeodomain), indicating that the SCR proteins comprise a
novel family.



6.2.6. SCR IS EXPRESSED IN THE CORTEX/ENDODERMAL
INITIALS AND IN THE ENDODERMIS

RNA blot analysis revealed expression of SCR in
Arabidopsis siliques, leaves and roots of wild-type plants
(FIG. 6A). No hybridization was detected to RNA from scr-1
plants (FIG. 6B, lane 2). This indicates that scr-1 has a
reduced level of RNA expression and may represent the null
phenotype. Hybridization to RNA species larger than the
normal size was detected in scr-2. This indicates that
abnormal SCR transcripts are made in this allele, suggesting
that functional but possibly altered proteins may be
produced.

To determine if expression was localized to any
particular cell type, RNA in situ hybridization was performed
on sections of root tissue. In mature roots, expression was
localized primarily to the endodermis (FIGS. 7A and 7B).
Expression appeared to start very close to or within the
cortex/endodermal initials and continue up the endodermal
cell file as far as the section extended. Expression was
also detected in late-torpedo stage embryos in the endodermis
throughout the embryonic axis (FIG. 7C). Sense strand
controls showed only background hybridization (FIG. 7D).

To determine whether the localization of SCR RNA
was regulated at the transcriptional or post-transcriptional
level, enhancer trap (ET) lines were prepared and examined in
which the β-glucuronidase (uidA or GUS) coding sequence with
a minimal promoter was expressed in the root endodermis
(see Section 7, infra). Restriction fragment length
polymorphisms were observed when DNA from one of these lines,
ET199, and wild-type were probed with SCR. PCR and sequence
analysis confirmed that the enhancer-trap construct had
inserted approximately 1 kb upstream of the SCR start site
and in the same orientation as that of SCR transcription.

In mature roots, expression in ET199 whole mounts
showed a similar pattern to that of the in situ
hybridizations, with the strongest staining present in
endodermal cells (FIG. 7E). Transverse sections indicated
that expression was primarily in endodermal cells in the
elongation zone (FIG. 7F). Longitudinal sections through the
meristematic zone revealed that expression could be detected
in the cortex/endodermal initial (FIG. 7G). Of particular
interest was the restriction of expression to the endodermal
daughter cell after the periclinal division (FIG. 7G). This
indicated that the expression pattern observed in the in situ
analysis was not due to post-transcriptional partitioning of
SCR RNA. Rather, it suggests that after the periclinal
division of the cortex/endodermal initial only one of the two
cells is able to transcribe SCR RNA.
6.3. DISCUSSION

6.3.1. THE SCR GENE REGULATES AN ASYMMETRIC 
DIVISION REQUIRED FOR ROOT RADIAL 
ORGANIZATION 

The formation of the cortex and endodermal layers
in the Arabidopsis root requires two asymmetric divisions.
In the first, an anticlinal division of the cortex/endodermal
initial generates two cells with different developmental
potentials. One will continue to function as an initial,
while the other undergoes a periclinal division to generate
the first cells in the endodermal and cortex cell files.
This second asymmetric division is eliminated in the
scarecrow mutant, resulting in a single cell layer instead of
two. The scr mutation appears to have little effect on any
other cell divisions in the root, indicating that it is
involved in regulating a single asymmetric division in this
organ. Several other mutations have been characterized that
appear to affect specific cell division pathways in
Arabidopsis. These include knolle (kn), in which formation of
the epidermis is impaired (Lukowitz et al., 1996, Cell
84:61-71), wooden leg (wol), in which vascular cell division
is defective (Scheres et al., 1995, Development 121:53-62),
and fass (fs), in which there are supernumerary cortex and
vascular cells (Scheres et al., 1995, Development 121:53-62;
Torres Ruiz & Jurgens, 1994, Development 120:2967-2978).
Only in the case of scr and short-root (shr) mutants has it
been shown that the defect is in a specific asymmetric
division.

Mutational analyses in several organisms have
revealed that the genes that regulate asymmetric divisions
can be specific to a single type of division or can affect
divisions that are not clonally related (Horvitz &
Herskowitz, 1992, Cell 68:237-255). In most cases, these
mutations result in the formation of two identical daughter
cells with similar developmental potentials (Horvitz &
Herskowitz, 1992, Cell 68:237-255). Both resulting cells
have the identity of one or the other of the normal daughter
cells, an example of which is the swi- mutation in S.
cerevisiae (Nasmyth et al., 1987, Cell 48:579-587). However,
there are also examples of mutations that result in the
formation of chimeric cell types such as the ham-1 mutation
in C. elegans (Desai et al., 1988, Nature 336:638-646).

6.3.2. SCR INVOLVEMENT IN CELL
SPECIFICATION OR CELL DIVISION

Genes that regulate asymmetric cell divisions can
be divided into those that specify the differentiated fates
of the daughter cells and those that function to effect the
division of the mother cell (Horvitz & Herskowitz, 1992,
Cell 68:237-255). The aberrant cell layer formed in the scr
mutant has differentiated features of both endodermal and
cortex cells. Thus, scr is in the rare class of asymmetric
division mutants in which a chimeric cell type is created.
The ability to express differentiated characteristics of
cortex and endodermal cells implies that the differentiation
pathways for both these cell types are intact and do not
require the functional SCR gene. This indicates that SCR is
involved primarily in regulating a specific cell division,
and that the correct occurrence of this division can be
unlinked from cell specification. This is in contrast to the
shr mutant, in which the periclinal division of the
cortex/endodermal initial also fails to occur and the
resulting cell lacks endodermal markers (Benfey et al., 1993,
Development 119:57-70) and has cortex attributes. A genetic
analysis was used to address the function of shr and scr in
the asymmetric division of the cortex/endodermal initial.
Placing mutants of each of these genes in a fs mutant
background asked whether the supernumerary cell divisions
characteristic of fs were sufficient to restore normal cell
identities (Scheres et al., 1995, Development 121:53-62). In
the shr,fs double mutant there were additional cell layers
but no endodermis, indicating that the SHR gene has a role in
specifying cell identity. In the scr,fs double mutant no
alteration in cell identity was observed as compared to fs
(Scheres et al., 1995, Development 121:53-62). Taken
together with the cell marker analysis presented herein,
these results are consistent with a role for SCR in
generating the division of the mother cell while the SHR gene
may be involved in specifying the fate of the endodermal
daughter.

6.3.3. A ROLE FOR SCR IN EMBRYONIC DEVELOPMENT 
At least one additional cell division appears to be
affected in the scr mutant. During embryonic development,
the ground tissue does not divide to form the endodermal and
cortex layers of the embryonic root and hypocotyl. As shown
herein, expression of SCR was detected in the endodermal
tissue throughout the embryonic axis shortly after this
division occurs. Thus, SCR may play a direct role in
regulating both this division and the division of the
cortex/endodermal initial in the root apical meristem.
Alternatively, the radial organization established in the
embryo may somehow act as a template that directs the
division of the cortex/endodermal initial, thus perpetuating
the pattern. This is consistent with the finding in the scr
mutant that the aberrant pattern established in the embryo is
perpetuated in the primary root. It is also consistent with
a recent study in which the daughter cells of the
cortex/endodermal initial were laser ablated (van den Berg et
al., 1995, Nature 378:62-65). When a single daughter cell
was ablated, it was replaced by a cell that followed the
normal asymmetric division pattern. When three adjacent
daughter cells were ablated, the central initial divided
anticlinally but failed to perform the periclinal division
(van den Berg et al., 1995, Nature 378:62-65). This provided
evidence that information from mature cells is required for
the correct division pattern of cortex/endodermal initials,
suggesting a "top down" transfer of information. However,
the absence of a cell layer in lateral roots and
callus-derived roots of the scr mutant suggests that embryo
events are not unique in their ability to establish radial
organization. Rather, these observations implicate SCR in
regulating both embryonic and post-embryonic root radial
organization.

6.3.4. TISSUE-SPECIFIC EXPRESSION OF SCR IS
REGULATED AT THE TRANSCRIPTIONAL LEVEL

Although not intending to be limited to any theory
or explanation regarding the mechanism of SCR action, the
cloning of the gene and the expression pattern provide some
clues as to the role of SCR in the regulation of a specific
asymmetric division. The SCR gene is expressed in the
cortex/endodermal initial, but immediately after division is
restricted to the endodermal lineage. A similar pattern is
seen in the ET199 enhancer trap line in which SCR regulatory
elements are in proximity to a GUS gene, indicating that SCR
restriction to the endodermal cell file is due to
differential regulation of expression of the SCR gene in this
cell and the first cell in the cortex file. Another marker
line in which expression of GUS is detected only in the
cortex daughter cell provides a control for differential
degradation of GUS RNA or protein. Thus, partitioning of SCR
RNA as a means of achieving this segregation of expression
can be ruled out. What remains to be determined is whether
this difference in transcriptional activity of the two
daughter cells is due to internal polarity of the mother cell
prior to division such that cytoplasmic determinants are
unequally distributed, or to external polarity that
influences cell fate after division. Since SCR is expressed
prior to cell division, an attractive hypothesis is that it
is involved in establishing polarity in the cortex/endodermal
initial. The sequence of the SCR protein strongly suggests
that it acts as a transcription factor. Hence, it may act to
regulate the expression of other genes essential for the
establishment of unequal division. Alternatively, it is
conceivable that it could play a role in creating an external
polarity that provides a signal to divide asymmetrically.
Its expression in more mature endodermal cells is consistent
with a role in "top-down" signaling.

6.3.5. A NEW FAMILY OF TRANSCRIPTIONAL REGULATORS 
Analysis of eighteen EST clones found in the
GenBank database reveals that the proteins they encode share
a high degree of homology with Arabidopsis SCR protein. See
Table 1 and FIGS. 15A-S. Further sequence analysis of the
encoded proteins indicates that a high degree of sequence
similarity extends from at least the highly conserved VHIID
domain to the carboxy-terminus of the gene products.
Comparison of the amino termini of these proteins is
precluded by the fact that the ESTs are incomplete. The high
degree of similarity among these proteins, in combination
with the motifs observed in the SCR protein (homopolymeric
motifs, two leucine heptad repeats and a bZIP-like basic
domain that may also function as a nuclear localization
sequence) indicates that these proteins form a novel class of
regulatory proteins.

The insertion sites of the T-DNA in the two scr
mutant alleles raised the possibility that the mutant
phenotype was due to the production of truncated proteins.
Northern blot analysis indicated SCR RNA is undetectable in
scr-1. This suggests that the phenotype is either the null,
or due to highly reduced RNA expression. In scr-2, an
alteration in RNA size was detected which would be consistent
with the presence of a functional and possibly truncated
protein. This could provide an explanation for the
observation that scr-2 appears to be the weaker allele.

7. EXAMPLE 2: ENHANCER TRAP ANALYSIS OF ROOT DEVELOPMENT
An enhancer trap system was used in order to
provide a more detailed molecular analysis of gene expression
in lateral root patterning and development in Arabidopsis
thaliana. A new collection of marker lines that express
β-glucuronidase (GUS) activity in a cell-type specific manner
in each of the cells of the root was generated. These lines
allow differentiation of cells to be monitored based on
molecular characteristics. One of these marker lines, ET199,
resulted from the integration of the GUS cassette in
proximity to an SCR enhancer. The results described below
demonstrate that transcriptional activation of the SCR gene
plays an important role in root development in Arabidopsis,
and that SCR gene transcriptional regulatory elements can
express a transgene in a developmentally and tissue specific
manner.



7.1. MATERIALS AND METHODS 

7.1.1. PLANT GROWTH CONDITIONS

Arabidopsis seeds from NO-O and Columbia ecotypes
were sterilized and sown on MS plates containing 4.5%
sucrose. Plates were oriented vertically and maintained
under an 18 hours light, 6 hours dark cycle.

7.1.2. HISTOLOGY AND GUS STAINING

For observation of lateral roots, roots were
removed from plates and infiltrated in 25% glycerol for
several hours to overnight. Roots were then mounted in 50%
glycerol. Whole seedlings were stained for GUS activity for
up to three days in the following solution: 1X GUS buffer,
20% methanol, 0.5 mg/ml X-Glu. Addition of methanol greatly
improves the specificity and reproducibility of staining.
Staining solution was made fresh from a 10X buffer (1 M Tris
pH 7.5, 290 mg NaCl, 66 mg K3Fe(CN)6) that was stored for no
more than one week. Stained roots were cleared in glycerol
and mounted as above. All samples were observed using
Nomarski optics on a Leitz Laborlux S microscope.
Photographs were taken using a Leitz MPS52 camera, and images
were scanned into Adobe Photoshop to create figures. In some
cases the intensity of the blue color was increased.
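
The composition of the staining solution given above (1X GUS
buffer, 20% methanol, 0.5 mg/ml X-Glu) can be turned into
component volumes for any batch size. The following is a
minimal sketch under stated assumptions: methanol is added
from an undiluted (100%) stock, the substrate is added from a
concentrated stock whose concentration (50 mg/ml by default)
is hypothetical and not taken from this specification, and the
remaining volume is made up with water. The function name is
likewise hypothetical.

```python
def staining_mix(final_ml: float, substrate_stock_mg_per_ml: float = 50.0) -> dict:
    """Component volumes (ml) for the GUS staining solution described above."""
    if final_ml <= 0 or substrate_stock_mg_per_ml <= 0:
        raise ValueError("volumes and concentrations must be positive")
    buffer_10x = final_ml / 10.0                            # 10X stock diluted to 1X
    methanol = final_ml * 0.20                              # 20% (v/v) final
    substrate = final_ml * 0.5 / substrate_stock_mg_per_ml  # 0.5 mg/ml final
    water = final_ml - (buffer_10x + methanol + substrate)
    if water < 0:
        raise ValueError("substrate stock too dilute for the requested mix")
    return {
        "10X GUS buffer": buffer_10x,
        "methanol": methanol,
        "substrate stock": substrate,
        "water": water,
    }


for name, volume in staining_mix(10.0).items():
    print(f"{name}: {volume:.2f} ml")
```
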
7.1.3. CONSTRUCTION OF ENHANCER TRAP LINES
Plant Cloning Vector (PCV) (Koncz et al., 1994,
Specialized vectors for gene tagging and expression studies,
in Plant Molecular Biology Manual, Gelvin & Schilperoort,
eds., Vol. B2, pp. 1-2, Kluwer Academic Press, Dordrecht,
The Netherlands) contains a BamHI site immediately adjacent
to the T-DNA right border sequence. The β-glucuronidase gene
fused to the TATA region (-46 to +8) of the CaMV 35S promoter
was introduced into this site (Benfey et al., 1990, EMBO J.
9:1677-1684). 350 transgenic lines were generated by
Agrobacterium-mediated root transformation (Marton & Browse,
1991, Plant Cell Reports 10:235-239), and 4 independent lines
from each transformant were screened for GUS activity in the
root.


7.2. RESULTS 

7.2.1. DIFFERENTIATION IN THE LRP 

The marker lines described above reflect patterns
of gene expression that are specific to individual root cell
types. There are no readily apparent mutant phenotypes in
any of these lines. Therefore, they can be used to analyze
the differentiation state of the cells during normal
development of the lateral root primordium (LRP). If there
are stages at which the pericycle cells proliferate in the
absence of patterning, it can be expected that all cells
would be identical with none expressing differentiated
characteristics. In contrast, organization of the LRP would
be reflected in differential patterns of GUS gene expression,
with certain cells beginning to turn on transcription from
differentiated cell-type specific promoters (i.e., those that
drive GUS expression in the enhancer trap lines).

The process of lateral root formation is divided 
into the following seven stages: 
Stage I: The LRP is first visible as a set of pericycle
cells that are clearly shorter in length than their
neighbors, having undergone a series of anticlinal divisions.
Laskowski et al. (1995, Dev. 121:3303-3310) predict that there
are approximately 4 founder pericycle cells involved. In the
longitudinal plane, these divisions result in the formation
of 8-10 small cells, which enlarge in a radial direction.

Stage II: A periclinal division occurs that divides the LRP
into two layers (Upper Layer (UL) and Lower Layer (LL)). Not
all the small pericycle-derived cells appear to participate
in this division; typically the most peripheral cells do
not divide. Hence, as the UL and LL cells expand radially
the domed shape of the LRP begins to appear.

Stage III: The UL divides periclinally, generating a three
layer primordium comprised of UL1, UL2 and LL. Again, some
peripheral cells do not divide, creating peripheral regions
that are one and two cell layers thick. This further
emphasizes the domed shape of the LRP.

Stage IV: The LL divides periclinally, creating a total of
four cell layers (UL1, UL2, LL1, LL2). At this stage the LRP
has penetrated the parent endodermal layer.

Stage V: The central cells in LL2 undergo a number of
divisions that push the overlying layers up and distort the
cells in LL1. These divisions are difficult to visualize at
this stage, but clearly form a knot of mitotic activity. The
LRP at this stage is midway through the parent cortex. The
outer layer contains 10-12 cells.

Stage VI: This stage is characterized by several events.
The four central cells of UL1 divide periclinally. This
division is particularly useful in identifying the median
longitudinal plane in the enlarging LRP. At this point
there are a total of twelve cells in UL1, four in the middle
that have undergone the periclinal division and four on
either side. In addition, all but the most central cells of
UL2 undergo a periclinal division. At this point the LRP has
passed through the parent cortex layer and has penetrated the
epidermis. The central cells apparently derived from LL2
have a distinct elongated shape characteristic of vascular
elements.

Stage VII: As the primordium enlarges it becomes difficult
to characterize the divisions in the internal layers.
However, the cells in the outermost layer can still be seen
very clearly. All of these cells undergo an anticlinal
division, resulting in 16 central cells (8 cells in each of
two layers) flanked by 8-10 cells on each side. We refer to
this as the 8-8-8 cell pattern. The LRP appears to be just
about to emerge from the parent root.
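
For readers who wish to record observations against this
staging scheme, the seven stages can be held in a small lookup
table. The sketch below is illustrative only: the class and
function names are hypothetical, the landmark strings are
condensed paraphrases of the stage descriptions above, and
layer names are listed only for the stages in which the text
enumerates them (II through IV).

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class LrpStage(NamedTuple):
    layers: Optional[Tuple[str, ...]]  # None where the text does not enumerate layers
    landmark: str


LRP_STAGES = {
    "I": LrpStage(None, "anticlinal divisions give 8-10 short pericycle-derived cells"),
    "II": LrpStage(("UL", "LL"), "first periclinal division"),
    "III": LrpStage(("UL1", "UL2", "LL"), "UL divides periclinally"),
    "IV": LrpStage(("UL1", "UL2", "LL1", "LL2"),
                   "LL divides periclinally; parent endodermis penetrated"),
    "V": LrpStage(None, "central LL2 cells divide; midway through parent cortex"),
    "VI": LrpStage(None, "central UL1 cells divide periclinally; epidermis penetrated"),
    "VII": LrpStage(None, "outermost layer divides anticlinally; about to emerge"),
}


def stage_for_layers(layers: Sequence[str]) -> Optional[str]:
    """Return the stage whose enumerated layers match `layers`, if any."""
    wanted = tuple(layers)
    for name, stage in LRP_STAGES.items():
        if stage.layers == wanted:
            return name
    return None


print(stage_for_layers(["UL1", "UL2", "LL"]))
```
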

7.2.2. MARKER LINES 

An enhancer trapping cassette was generated by
fusing the GUS coding sequence to the minimal promoter of the
35S promoter from CaMV. This minimal promoter does not
produce a detectable level of GUS expression. However, its
presence allows other upstream elements to direct GUS
expression in a developmental and/or cell-specific manner
(Benfey et al., 1990, EMBO J. 9:1677-1684). The use of a
minimal promoter instead of a promoterless construct allows
GUS expression to occur even if the enhancer trap cassette
inserts at a distance from the coding region. Since the
insert does not have to be within the structural gene, there
are often no mutations generated in the enhancer trap lines.
The minimal promoter:GUS construct was cloned immediately
adjacent to the T-DNA right border sequence of PCV (Koncz et
al., supra) and introduced into Arabidopsis. 350 independent
lines were generated and analyzed for GUS activity in the
root. The following lines most clearly define each cell
type. All of the lines were generated through enhancer
trapping, as described herein, except for CorAX92
(Dietrich et al., 1992, Plant Cell 4:1371-1382) and
EpiGL2:GUS (Masucci et al., Dev. 122:1253-1260), which are
transgenic plants that contain cell-type specific promoters
fused to the GUS gene.


Ste05 - expresses GUS in the stele including the pericycle
layer throughout primary and lateral roots. At the root tip,
staining becomes weaker in the elongation zone; therefore, it
is likely that only differentiated stele cells express GUS
activity. Stelar GUS expression is also seen in aerial parts
of the plant.

End195 - expresses GUS in the endodermis of primary and
lateral roots. Staining can be seen most clearly in the
cells in the meristematic region of the root, although
overstaining shows that more mature cells also express some
GUS activity. It appears that there is no staining in the
cortex/endodermal initial, but staining is evident in the
first daughter cell of this initial. GUS expression is also
seen at the base of young leaves and in the stipules.

ET199 - expresses GUS in the endodermis of primary and
lateral roots, again most clearly in cells in the
meristematic region. Unlike End195, staining in ET199
appears to continue down to the cortex/endodermal initial
and, in younger roots, even into the cells of the quiescent
center. Expression in the aerial parts of the plant is
detectable in the young leaf primordia.

CorAX92 - This line was generated by fusing the 5' and 3'
sequences from a cortex specific gene isolated from oilseed
rape to the GUS reporter gene (Dietrich et al., 1992, Plant
Cell 4:1371-1382). Expression is limited to the cortex layer,
extending to but not including the cortex/endodermal initial.
Staining is also apparent in the petioles and leaf blades of
expanded leaves.

EpiGL2:GUS - This line was generated by fusing the GL2
promoter to the GUS gene (Masucci et al., Dev.
122:1253-1260). Expression is seen in the non-hair forming
epidermal cells (atrichoblasts). Staining is seen near the
root tip, but it is difficult to determine if it includes the
epidermal initial. Staining is also seen in the trichomes,
leaf primordia, and the epidermis of the hypocotyl and leaf
petioles.

CRC219 - This line shows staining in the columella root cap
only.

LRC244 - This line shows staining in the lateral root cap
only.

RC162 - This line shows staining in both the lateral and
columella root caps.

Two marker lines show differential staining at
very early stages of LRP development. One of these, ET199,
presents a complex and dynamic pattern of expression.
Staining is first apparent at stage II in only the four
central cells of the UL. At stage III staining is strongest
in the central cells of UL2. As the LRP reaches stage V the
staining remains strongest in the central 2-4 cells of UL2.
By stage VI staining also begins to extend into the newly
formed endodermal layer, and staining in both the central
cells and endodermis persists beyond emergence of the lateral
root.

Another line, LRB10 (lateral root base), does not
express GUS in the primary root tip. Staining in the LRP is
seen at stage I, and at stage II all the cells of the UL and
LL are stained. However, by stage IV and V only the cells at
the periphery of the LRP are still expressing GUS. As the
LRP develops, these cells continue to stain, although less
intensely, resulting in a ring of GUS expressing cells at the
base of the LR.
LRB10 and ET199 clearly demonstrate non-identity
between the cells at very early stages, stage IV in the case
of LRB10 and within the UL at stage II in ET199. In
addition, although it is difficult to identify the nature of
the cells that correspond to the observed staining pattern in
LRB10 and the early staining cells of ET199, post-emergent
lateral roots show analogous staining in these lines,
suggesting that the stained cells are already expressing
markers that reflect their differentiated cell fates. Hence,
these observations suggest a very early onset of
differentiation in the cells of the LRP.



7.2.3. ET199 PROVIDES EVIDENCE FOR THE ROLE OF 
SCR IN PLANT DEVELOPMENT 

Fortuitously, it was discovered that the GUS
cassette in ET199 described in Section 7.2.2, above, is
situated approximately 1 kb upstream from the SCR gene. The
SCR cDNA was labelled and used to probe genomic DNA from WT
and ET199 plants. The band pattern seen in the Southern was
completely consistent with a T-DNA inserted 1 kb upstream of
the putative SCARECROW start site. Subsequently, a DNA fragment
was PCR amplified using a primer within the T-DNA and a
primer within SCARECROW. The size of this fragment was also
consistent with the predicted insertion site. Partial
sequencing of the PCR fragment confirmed the presence of
SCARECROW sequence. Mutants in the SCR gene are completely
lacking one of the radial layers between the epidermis and
pericycle in both primary and lateral roots, due to the
absence of a specific cell division during embryogenesis and
of the cortex/endodermal initial during post-embryonic growth.
The expression pattern (described in Section 7.2.2, above)
that was observed in the central cells of the developing LRP
of ET199 provides strong evidence that the cells in this
region are involved in the establishment of the meristematic
initials. More importantly, these results demonstrate that
transcriptional activation of the SCR gene plays a major role
in the development of the Arabidopsis LRP. Furthermore,
these results demonstrate that a transgene can be expressed
under the control of SCR gene transcriptional regulatory
elements in a developmental and tissue-specific manner.

8. EXAMPLE 3: ACTIVITY OF ARABIDOPSIS SCR
PROMOTER IN TRANSGENIC ROOTS

The expression pattern of Arabidopsis SCR has
been determined by analysis of an enhancer trap line, ET199,
in which a GUS coding region with a minimal promoter was
fortuitously inserted 1 kb upstream of the SCR coding region
(see supra). In ET199 plants, GUS expression is detected in
the endodermis, endodermal initials and sometimes in the
quiescent center (QC) of the root. See supra and Malamy and
Benfey, 1997, Dev. 124:33-44. This expression pattern of SCR
in the primary root has been confirmed by in situ analysis
(see supra and Di Laurenzio et al., 1996, Cell 86:423-433).

The following experiments demonstrate that 2.5 kb
of 5' sequence upstream of the Arabidopsis SCR coding region
is sufficient to confer the SCR expression pattern to a
heterologous gene. The 5' sequence used in these studies
starts from the HindIII site approximately 2.5 kb upstream
of the ATG initiation site and extends 3' downstream to the
base pair immediately upstream of the ATG initiation site
(see FIG. 14). This 5' sequence was fused to a GUS coding
sequence. The resulting SCR promoter::GUS construct was
incorporated into an Agrobacterium vector, which was used to
transform and generate transgenic roots using standard
procedures.
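
The boundaries of the 5' fragment described above (from a
HindIII site through the base pair immediately upstream of the
ATG) can be expressed as a simple sequence operation. The
sketch below is illustrative and is not the cloning procedure
itself. It assumes a 0-based index for the A of the ATG and
takes the nearest HindIII recognition site (AAGCTT) upstream
of that position, which is an assumption: in a real genomic
sequence the intended site might not be the nearest one. The
function name and the toy sequence are hypothetical.

```python
HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence


def upstream_fragment(genomic: str, atg_index: int) -> str:
    """Sequence from the nearest upstream HindIII site up to, not including, the ATG."""
    genomic = genomic.upper()
    if genomic[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG")
    # rfind restricts the search to genomic[0:atg_index], so the whole
    # recognition site lies upstream of the initiation codon.
    site_start = genomic.rfind(HINDIII_SITE, 0, atg_index)
    if site_start == -1:
        raise ValueError("no HindIII site found upstream of the ATG")
    return genomic[site_start:atg_index]


# Toy sequence for illustration only; not the SCR genomic sequence.
toy = "GG" + "AAGCTT" + "CCCC" + "ATGAAA"
print(upstream_fragment(toy, toy.index("ATG")))
```

The returned fragment ends at the base immediately 5' of the
ATG, so a reporter coding sequence appended to it would begin
with its own initiation codon.
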

A large number of roots were regenerated. They
show a GUS staining pattern that is similar to the SCR
expression pattern in ET199 plants (Figure 19, Panel f).
Since organs regenerated from callus often have an abnormal
morphology, transgenic roots were transferred to liquid
culture. Roots grown in liquid culture appeared
morphologically normal and showed GUS expression in the
endodermis, endodermal initial and QC (Figure 19, Panel g),
similar to the expression pattern of SCR seen in the
enhancer trap line ET199. These results indicate that the
2.5 kb region upstream of the SCR start site is sufficient to
confer the SCR expression pattern in the root.

The expression of the SCR promoter::GUS construct
was also examined in the scr mutant background. The scr mutant
has an altered root organization (see supra). Whereas the
wild-type root of Arabidopsis has four distinct cell layers
surrounding the vascular tissue, the roots of the scr mutant have
only three.

Transgenic roots of the scr mutant were generated
that contained a SCR promoter::GUS construct. As in the
wild-type, a large number of transgenic roots were formed
that had detectable GUS expression (Figure 20, Panel a).
These roots were shorter than wild-type regenerated roots,
consistent with the shorter root phenotype of the scr mutant.

Additional transgenic root experiments
demonstrated that the SCR gene under control of its own
promoter can rescue the scr mutant phenotype. Transgenic scr
roots were generated that contained the full length SCR gene
under the control of its own promoter. Transgenic roots
containing the construct were longer than
those of the scr mutant, indicating that the introduced SCR
gene partially rescued the mutant. Whereas scr regenerated
roots that carried the SCR promoter::GUS construct were very
short (Figure 21, Panel a; and Figure 20, Panel a), roots
transformed with the SCR promoter and coding region were
noticeably longer (Figure 21, Panel b). The difference was
even more obvious in liquid culture, in which scr mutant
roots remained short (Figure 21, Panel c), while SCR gene
complemented scr mutant roots were long and resembled
wild-type roots (Figure 21, Panel d).

Anatomical studies of the regenerated roots
confirmed the ability of the SCR promoter::SCR gene construct
to rescue the scr mutant phenotype. Whereas regenerated
roots of the scr mutant were missing an internal layer (Figure
21, Panel e), the scr mutant roots that were transformed with
the SCR promoter::SCR gene construct had a radial
organization that resembled the wild-type root (Figure 21,
Panel f).

9. EXAMPLE 4: ISOLATION OF SCR SEQUENCES USING PCR-CLONING STRATEGY

Based on the comparison of the sequences of SCR
paralogs in Arabidopsis, degenerate primers SCR3AII, SCR5AII
and SCR5B were designed and used in PCR amplification of SCR
sequences from genomic DNA of various plant species. The
amplification was performed according to the conditions described
in Section 5.1.1., supra, using DNA isolated from maize
plants grown from a commercial seed mixture. Amplification
products (104 bp fragment for the SCR5B+SCR3AII primer
combination; 146 bp fragment for the SCR5AII+SCR3AII primer
combination) were obtained, and each was cloned into a T/A vector
(Invitrogen, San Diego, CA) and sequenced. Two of the three
different types of clones obtained had deduced amino acid
sequences that were very similar to a part of the Arabidopsis
SCR protein (i.e., approximately 90% identity), suggesting
that they represent parts from two different alleles of the
maize SCR gene (i.e., ZCR gene). The two clones each had
only two conservative changes in their nucleotide sequence.
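
The "approximately 90% identity" comparison above is a pairwise measure over an aligned segment. The following is a minimal sketch of such a calculation, assuming the two sequences have already been aligned to equal length. The function name and the example peptides are hypothetical illustrations; they are not the maize or Arabidopsis sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Both inputs are assumed to be pre-aligned strings of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns that contain a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical 10-residue segments differing at one position.
identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
```

Whether gapped columns are excluded or counted as mismatches is a design choice; this sketch excludes them.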

The 146 bp amplification product, ZmScl1, was
subsequently used as a probe for screening of a genomic
library generated in lambda BlueSTAR vector (NOVAGEN) from
maize (Hill line) genomic DNA. The screening was performed
according to the standard procedures described in the Genius™
System User's Guide for Membrane Hybridization (Boehringer-
Mannheim). The probe was a single-stranded DNA molecule
corresponding to the ZmScl1 fragment produced by PCR (Genius,
Boehringer-Mannheim). Hybridization was performed according
to the recommendations of the manufacturer's manual
(Boehringer-Mannheim). Prehybridization was for 2 hr in 50%
formamide hybridization solution at 42°C. Hybridization was
overnight at 42°C with 200 ng/ml probe concentration.
Filters were washed twice at room temperature in 2xSSC, 0.1%
SDS for 5 min, and then, for stringent washing, twice at 65°C in
0.5xSSC, 0.1% SDS for 15 min.

A positive clone was identified. The clone
contained a 13 kb insert, which was subcloned into a plasmid
vector. The resulting plasmid was designated pZCR. A 5 kb
Eco RI fragment containing the maize SCR (ZCR) sequence was
subcloned and sequenced. The nucleotide sequence of the
region containing a partial ZCR coding sequence is shown in
FIG. 17A and the corresponding deduced amino acid sequence is
shown in FIG. 17B. The ZCR protein contains a segment that is
highly homologous to a corresponding segment in the
Arabidopsis SCR protein (FIG. 17B). This segment is flanked
by segments of low homology. Thus, it is possible that the
genomic clone of ZCR is a composite clone, containing
sequences that are not ZCR sequences.

The deduced ZCR protein sequence was aligned with
that of Arabidopsis SCR protein. The comparison revealed new
conserved sites in the SCR coding sequence which were used to
design new, more specific PCR primers (i.e., 1F, 1R, and 4R)
for use in amplification of SCR sequences from yet other
plant species.
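
Degenerate primers such as those discussed above are written with IUPAC ambiguity codes, and their degeneracy (the number of distinct oligonucleotides in the mix) follows directly from the codes used. The sketch below illustrates the bookkeeping, assuming standard IUPAC nucleotide codes. The primer string shown is hypothetical; the actual sequences of primers 1F, 1R and 4R are not given in this section.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for code in primer.upper():
        count *= len(IUPAC[code])
    return count


def expand(primer: str) -> list[str]:
    """List every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[code] for code in primer.upper()]
    return ["".join(bases) for bases in product(*choices)]


# Hypothetical primer with two two-fold degenerate positions (Y = C or T).
example_primer = "GAYTTYAA"
n_variants = degeneracy(example_primer)
variants = expand(example_primer)
```

Lower degeneracy generally means a more specific primer, which is the motivation for designing new primers from newly identified conserved sites.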

Using combinations of primers 1F+1R and 1F+4R,
PCR amplifications were performed as described in Section
5.1.1. Two DNAs of expected size were obtained from soybean:
a 247 bp DNA from the 1F+1R primer combination and a 379 bp
DNA from the 1F+4R primer combination. A DNA of expected
size (247 bp) was obtained from carrot and spruce when their
genomic DNA was amplified using 1F+4R primer combination.
The nucleotide sequences of the 379 bp soybean DNA (SRPgl),
the 247 bp DNA from carrot (SRPdl) and spruce (SRPpl) are
shown in FIGS. 16K-M. The corresponding deduced amino acid
sequences of these amplified sequences are shown in FIG. 18.
Comparison of these partial SCR coding sequences indicates that
this approach isolated DNA sequences that encode SCR proteins
with amino acid sequences that are very similar but not
identical to a segment of Arabidopsis SCR protein (see FIG.
18).
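
A deduced amino acid sequence of the kind referred to above is obtained by translating the nucleotide sequence in the appropriate reading frame. The following is a minimal sketch using the standard genetic code; the function name is an illustrative assumption. The test fragment is the stretch beginning at the ATG in the first printed line of SEQ ID NO:1, and its translation matches the first seven residues of SEQ ID NO:2 (Met Ala Glu Ser Gly Asp Phe).

```python
# Standard genetic code, built in TCAG order for each codon position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate DNA to one-letter amino acids in the given frame (0, 1 or 2).

    Codons containing characters other than A, C, G, T become 'X';
    a trailing partial codon is ignored. Stop codons are shown as '*'.
    """
    seq = dna.upper().replace(" ", "")
    protein = []
    for pos in range(frame, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[pos:pos + 3], "X"))
    return "".join(protein)


# First seven codons of the SCR coding region as printed in SEQ ID NO:1.
peptide = translate("ATGGCGGAATCCGGCGATTTC")
```

Mapping unreadable codons to 'X' rather than raising an error is a deliberate choice here, since printed sequence listings can contain ambiguous characters.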
10. EXAMPLE 5. EXPRESSION PATTERN OF MAIZE ZCR GENE
IN ROOT TISSUE

These experiments examined the expression pattern
of ZCR in the primary root and quiescent centers of maize
root. The expression pattern was determined by in situ
hybridization using a ZCR RNA probe corresponding to an
amino acid segment that is highly homologous to a
corresponding segment of the Arabidopsis SCR protein. The
experiment was carried out as follows. Restriction fragments
containing the maize ZCR sequence were isolated from pZCR and
subcloned into a pBluescript vector for in vitro
transcription. The probe was synthesized using conditions
described in the Genius Dig RNA labeling kit. The
pBluescript plasmid was linearized, and 1 µg was used as a
template to synthesize digoxigenin-labeled RNA using the T7
polymerase. The RNA probe was subjected to mild alkali
hydrolysis by heating at 60°C for 1 hr in 100 mM carbonate
buffer (pH 10.2) to yield a probe size of approximately 0.15
kb. Probe concentration for hybridization was optimized at 1
µg/ml/kb. In situ hybridization of root tips from 48 to 72
hr-old maize seedlings or excised quiescent centers (QCs) of
roots was carried out following procedures described in
Section 6.1.6., supra.
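
The hydrolysis time and probe concentration above can be related by simple arithmetic. The sketch below is an illustration only. It assumes a commonly cited rule of thumb for carbonate hydrolysis of riboprobes at 60°C, t = (L0 - Lf) / (k x L0 x Lf) with k of about 0.11 per kb per min; this relation and the value of k are not stated in this document. It also assumes that "1 µg/ml/kb" means the working concentration scales with probe complexity in kb. The function names and the 1.0 kb starting length are hypothetical.

```python
def hydrolysis_minutes(start_kb: float, final_kb: float, k: float = 0.11) -> float:
    """Estimated alkaline hydrolysis time (min) to shorten a transcript.

    Assumed rule of thumb (not from this document):
        t = (L0 - Lf) / (k * L0 * Lf), with k in cuts per kb per minute.
    """
    if not 0 < final_kb < start_kb:
        raise ValueError("final length must be positive and below the start length")
    return (start_kb - final_kb) / (k * start_kb * final_kb)


def probe_concentration_ug_per_ml(complexity_kb: float,
                                  rate_ug_per_ml_per_kb: float = 1.0) -> float:
    """Working probe concentration, scaled by probe complexity in kb."""
    return rate_ug_per_ml_per_kb * complexity_kb


# Hypothetical 1.0 kb transcript reduced to the 0.15 kb fragments named above.
minutes = hydrolysis_minutes(1.0, 0.15)          # about 51.5 min under the assumed k
working_conc = probe_concentration_ug_per_ml(1.0)
```

Under the assumed relation, the estimate for long starting transcripts approaches about 60 min for a 0.15 kb target, which is in line with the 1 hr incubation described above.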

The results show that ZCR expression in maize
primary roots is localized to a file of cells that is
identified as the endodermal layer. The expression pattern
continues in a single uninterrupted file through the QC, which
consists of approximately 1000-1500 cells (FIG. 22).

In two-week old regenerating QCs, ZCR expression
is found in a file of cells extending through the newly
formed apex. Thus, the regenerated roots exhibit a ZCR
expression pattern that is similar to that seen in the
primary root, even though the root apex does not contain the
normal arrangement of cell files at this stage.

ZCR expression during regeneration of the root
apex was also examined. In the initial stages of
regeneration, cell proliferation occurs to fill in the
removed tissue and begins to regenerate the basic shape of
the root tip. All cells on the blunt edge of the root
appear to contribute to the new population of cells. The
ZCR expression pattern indicates that molecular signals are
differentially present in these cells at an early stage in
regeneration. The gene appears to be diagnostic of cells
that are preparing to undergo asymmetrical division in order
to re-establish the normal organization of the root apex from
the large undifferentiated cells. The results indicate that
ZCR expression is required for pattern formation since it is
expressed prior to the generation of any specific anatomical
pattern in the newly formed root tissue.

11. EXAMPLE 6. EXPRESSION PATTERN OF ZCR
GENE IN SOYBEAN ROOTS AND ROOT NODULES

SCR expression in soybean roots and nodules was
examined using in situ hybridization with a SCR probe. The
procedures used were as described in Sections 6.1.6. and 11.

In primary roots, SCR is expressed in the
endodermis. Expression was also found in cells at the root
tip that are located at the distal end of the endodermal cell
files. In soybean nodules, expression of SCR was detected in
the peripheral tissue at the site of developing vascular
strands. At later stages of vascular development within the
nodule, SCR expression was found flanking the vascular
tissue. These results indicate that SCR is involved in
regulating vascularization in the nodule by contributing to
the radial organization that is required to generate
endodermis. These findings indicate that the SCR promoter may be
used to express proteins in a highly tissue-specific manner
in soybean nodules. One application is to use the SCR promoter
to engineer nodules through production of components in a
tissue-specific manner. Another application is that
modification of the expression of SCR could enhance nodule
activity by improving vascularization and/or the number of
endodermal layers.
12. EXAMPLE 7. SCR EXPRESSION AFFECTS
GRAVITROPISM OF AERIAL STRUCTURES

In addition to being defective in specific
embryonic and postembryonic meristematic divisions, both the
scr and the shr mutants have shoots that exhibit severely
defective gravitropism. Complementation analysis showed that
scr is allelic to a sgr (shoot gravitropism) mutant, sgr1.
Four mutant alleles of SCR (i.e., scr1, scr2, sgr1-1 and
sgr1-2) have been identified. All four of these mutants have
normal root gravitropism and defective shoot gravitropism.

Etiolated hypocotyls of scr mutants placed on
their sides do not respond to gravity even after 3 hr.
Similar behavior was observed with the inflorescence stems
of the sgr1-1 mutant, which do not curve upwards even after two
days on their sides. In contrast, the roots of these plants
respond rapidly to the change in orientation with the same
kinetics as the wild type. Thus, mutations in the SCR gene
lead to a radial pattern deficiency in the root but have no
effect on root gravitropism.

Comparable results were also obtained for shr
roots and for hypocotyls and inflorescence stems, i.e., data
indicate that shr shows normal root gravitropism but almost
no stem gravitropism.

13. DEPOSIT OF MICROORGANISMS

The following microorganisms have been deposited
in accordance with the terms of the Budapest Treaty with the
American Type Culture Collection, 12301 Parklawn Drive,
Rockville, MD 20852, U.S.A., on the dates indicated:

Microorganism   Clone                     Accession No.   Date

DH5α            pGEX-2TK*                 98031           April 26, 1996
                (pLIG 1-3/Sac+MOBlSac)
DH5α            pNYH1 (Zm-scl1b)          98032           April 26, 1996
DH5α            pNYH2 (Zm-scl1)           98033           April 26, 1996
DH5α            pNYH3 (Zm-scl2)           98034           April 26, 1996
DH5α            pZCR                                      April 18, 1997
Although the invention is described in detail
with reference to specific embodiments thereof, it will be
understood that variations which are functionally equivalent
are within the scope of this invention. Indeed, various
modifications of the invention in addition to those shown and
described herein will become apparent to those skilled in the
art from the foregoing description and accompanying drawings.
Such modifications are intended to fall within the scope of
the appended claims.

Various publications are cited herein, each of 
the disclosures of which is incorporated by reference in its 
entirety. 



WO 97/41152
PCT/US97/07022

International Application No: PCT/

MICROORGANISMS

Optional Sheet in connection with the microorganism referred to on page 86, lines 25-37 of the description

A. IDENTIFICATION OF DEPOSIT

Further deposits are identified on an additional sheet

Name of depositary institution:
American Type Culture Collection

Address of depositary institution (including postal code and country):
12301 Parklawn Drive
Rockville, MD 20852
US

Date of deposit: April 26, 1996     Accession Number: 98031

B. ADDITIONAL INDICATIONS (leave blank if not applicable)

C. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE

D. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

E. This sheet was received with the International application when filed (to be checked by the receiving Office)

(Authorized Officer)

The date of receipt (from the applicant) by the International Bureau

(Authorized Officer)

Form PCT/RO/134 (January 1981)

International Application No: PCT/

Form PCT/RO/134 (cont.)

American Type Culture Collection
12301 Parklawn Drive
Rockville, MD 20852
US

Accession No.     Date of Deposit
98032             April 26, 1996
98033             April 26, 1996
98034             April 26, 1996
                  April 18, 1997
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Benfey, Phillip N.
               Di Laurenzio, Laura
               Wysocka-Diller, Joanna
               Malamy, Jocelyn E.
               Pysh, Leonard
               Helariutta, Yrjo

(ii) TITLE OF INVENTION: SCARECROW GENE, PROMOTER AND USES
THEREOF

(iii) NUMBER OF SEQUENCES: 67

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Pennie & Edmonds LLP
(B) STREET: 1155 Avenue of the Americas
(C) CITY: New York
(D) STATE: New York
(E) COUNTRY: USA
(F) ZIP: 10036-2711

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: US
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/638,617
(B) FILING DATE: 26-APR-1996

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Coruzzi, Laura A.
(B) REGISTRATION NUMBER: 30,742
(C) REFERENCE/DOCKET NUMBER: 005914-0056-999

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (212) 790-9090
(B) TELEFAX: (212) 869-9741
(C) TELEX: 66141 PENNIE


(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2163 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CCTTATTTAT AACCATGCAA TCTCACGACC AACAACCCTT CAATCTCCAT GGCGGAATCC 60

GGCGATTTCA AOGGTGGTCA ACCTCCTCCT CATACTCCTC TGAGAACAAC TTCTTCCOCT 120

AGTAGCAGCA GCAACAACCG TGGTCCTCCT CCTCCTCCTC CTCCTCCTTT AGTGATGGTC 180 

AGAAAAAGAT TACCTTCCGA GATGTCTTCT AACCCTOACT ACAACAACTC CTCTOOTCCT 240 

CCTCGCCGTG TCTCTCACCT TCTTGACTCC AACTACAATA CTGTCACACC ACAACAAOCA 300 

CCGTCTCTTA CGGCGGGGGC TACTGTATCT TCTCAACCAA ACCCACCACT CTCTGTTTGT 360 

GCCTTCTCTG GTCTTCCOGT TTTTCCTTCA GACCGTGGTG GTCGGAATGT TATGATGTCC 420 

GTACAACCAA TGGATCAAGA CTCTTCATCT TCTTCTGCTT CACCTACTGT ATGGGTTGAC 480 

GCCATTATCA GAGACCTTAT CCATTCCTCA ACTTCAGTCT CTATTCCTCA ACTTATCCAA 540 

AACGTTAGAG ACATTATCTT CCCTTGTAAC CCAAATCTCG GTGCTCTTCT TGAATACAGG 600 

CTCCGATCTC TCATGCTCCT TGATCCTTCC TCTTCCTCTG ACCCTTCTCC TCAAACTTTC 660 

GAACCTCTCT ATCAGATCTC CAACAATCCT TCTCCTCCAC AACAGCAACA GCAGCACCAA 720 

CAACAACAAC AACAGCATAA GCCTCCTCCT CCTCCGATTC AGCAGCAAGA AAGAGAAAAT 780 

TCTTCTACCG ATGCACCACC GCAACCAGAG ACAGTGAGGG CCACTGTTCC CGCCGTCCAA 840 

ACAAATACGG CCGAGCCTTT AAGAGAGAGG AAGGAAGAGA TTAAGAGGCA GAAGCAAGAC 900 

GAAGAAGGAT TACACCTTCT CACATTGCTG CTACAGTGTG CTGAAGCTGT CTCTGCTGAT 960 

AATCTCGAAG AAGCAAAGAA GCTTCTTCTT GAGATCTCTC AGTTATCAAC TCCTTACGGG 1020 

ACCTCACCCC AGAGAGTAGC TGCTTACTTC TCGGAAGCTA TGTCAGCGAG ATTACTCAAC 1080 

TCGTGTCTCG CAATTTACGC GGCTTTGCCT TCACGGTGGA TGCCTCAAAC GCATAGCTTG 1140 

AAAATGGTCT CTGCGTTTCA GGTCTTTAAT GGCATAAGCC CTTTAGTGAA ATTCTCACAC 1200 

TTTACAGCGA ATCAGGCGAT TCAAGAAGCA TTTGAGAAAG AAGACAGTGT ACACATCATT 1260 

GACTTGGACA TCATCCAGGG ACTTCAATGG CCTGCTTTAT TCCACATTCT TGCTTCTAGA 1320 

CCTGGAGGAC CTCCACACGT GCGACTCACG GGACTTGGTA CTTCCATGGA AGCTCTTCAG 1380 

GCTACAGGGA AACGTCTTTC GGATTTCACA GATAAGCTTG CCCTGCCTTT TGAGTTCTGC 1440 

CCTTTAGCTC AGAAAGTTGG AAACTTOGAC ACTGAGAGAC TCAATGTGAG GAAAAGGGAA 1500 

OCTGTGGCTG TTCACTGGCT TCAACATTCT CTTTATGATG TCACTGCCTC TGATGCACAC 1560

ACTCTCTCGT TACTCCAAAC GTAAAATAAA CATTACCTTT TAATCACTCT TTATCTATAA 1620 

ATTATTTTAA GATTATATAG GAAAGATATG TTCTAAAAAG CTGGCTTTTT TGGTTAATCA 1680 

TTGGGGAATG AACACATTAG CTCCTAAAGT TGTCACAGTA GTCGAGCAAG ATTTGAGCCA 1740 

CGCTGGTTCT TTCTTAGGAA GATTTGTAGA GGCAATACAT TACTACTCTG CACTCTTTGA 1800 

CTCACTOGGA GCAAGCTAOG GCGAAGAGAG TGAAGAGAGA CATGTCGTGG AACAGCAGCT 1860

ATTATCGAAA GAGATACGCA ATGTATTAGC GOTTGCAGGA CCATCGAGAA GCGGTGAAGT 1920 

GAAGTTTGAG AGCTGGAGGG AGAAAATGCA ACAATCTGGG TTTAAACGTA TATCTTTAGC 1980 

TGGAAATGCA GCTACAGAAG CGACTCTACT GTTGGGAATG TTTCCTTCGC ATGGTTACAC 2040 

TTTGCTTGAT GATAATGGTA CACTTAAGCT TGGATGCAAA GATCTTTCCT TACTCACTGC 2100 
TTCAGCTTGO ACOCCTCOTT CTTAGTTTTC TTCTCCTTTT TCACAAACAA TGTGCCCATA 2160

AAT 2163

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 653 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Ala Glu Ser Gly Asp Phe Asn Gly Gly Gln Pro Pro Pro His Ser
1 5 10 15

Pro Leu Arg Thr Thr Ser Ser Gly Ser Ser Ser Ser Asn Asn Arg Gly
20 25 30

Pro Pro Pro Pro Pro Pro Pro Pro Leu Val Met Val Arg Lys Arg Leu
35 40 45

Ala Ser Glu Met Ser Ser Asn Pro Asp Tyr Asn Asn Ser Ser Arg Pro
50 55 60

Pro Arg Arg Val Ser His Leu Leu Asp Ser Asn Tyr Asn Thr Val Thr
65 70 75 80

Pro Gln Gln Pro Pro Ser Leu Thr Ala Ala Ala Thr Val Ser Ser Gln
85 90 95

Pro Asn Pro Pro Leu Ser Val Cys Gly Phe Ser Gly Leu Pro Val Phe
100 105 110

Pro Ser Asp Arg Gly Gly Arg Asn Val Met Met Ser Val Gln Pro Met
115 120 125

Asp Gln Asp Ser Ser Ser Ser Ser Ala Ser Pro Thr Val Trp Val Asp
130 135 140

Ala Ile Ile Arg Asp Leu Ile His Ser Ser Thr Ser Val Ser Ile Pro
145 150 155 160

Gln Leu Ile Gln Asn Val Arg Asp Ile Ile Phe Pro Cys Asn Pro Asn
165 170 175

Leu Gly Ala Leu Leu Glu Tyr Arg Leu Arg Ser Leu Met Leu Leu Asp
180 185 190

Pro Ser Ser Ser Ser Asp Pro Ser Pro Gln Thr Phe Glu Pro Leu Tyr
195 200 205

Gln Ile Ser Asn Asn Pro Ser Pro Pro Gln Gln Gln Gln Gln His Gln
210 215 220

Gln Gln Gln Gln Gln His Lys Pro Pro Pro Pro Pro Ile Gln Gln Gln
225 230 235 240

Glu Arg Glu Asn Ser Ser Thr Asp Ala Pro Pro Gln Pro Glu Thr Val
245 250 255

Thr Ala Thr Val Pro Ala Val Gln Thr Asn Thr Ala Glu Ala Leu Arg
260 265 270

Glu Arg Lys Glu Glu Ile Lys Arg Gln Lys Gln Asp Glu Glu Gly Leu
275 280 285

His Leu Leu Thr Leu Leu Leu Gln Cys Ala Glu Ala Val Ser Ala Asp
290 295 300

Asn Leu Glu Glu Ala Asn Lys Leu Leu Leu Glu Ile Ser Gln Leu Ser
305 310 315 320

Thr Pro Tyr Gly Thr Ser Ala Gln Arg Val Ala Ala Tyr Phe Ser Glu
325 330 335

Ala Met Ser Ala Arg Leu Leu Asn Ser Cys Leu Gly Ile Tyr Ala Ala
340 345 350

Leu Pro Ser Arg Trp Met Pro Gln Thr His Ser Leu Lys Met Val Ser
355 360 365

Ala Phe Gln Val Phe Asn Gly Ile Ser Pro Leu Val Lys Phe Ser His
370 375 380

Phe Thr Ala Asn Gln Ala Ile Gln Glu Ala Phe Glu Lys Glu Asp Ser
385 390 395 400

Val His Ile Ile Asp Leu Asp Ile Met Gln Gly Leu Gln Trp Pro Gly
405 410 415

Leu Phe His Ile Leu Ala Ser Arg Pro Gly Gly Pro Pro His Val Arg
420 425 430

Leu Thr Gly Leu Gly Thr Ser Met Glu Ala Leu Gln Ala Thr Gly Lys
435 440 445

Arg Leu Ser Asp Phe Thr Asp Lys Leu Gly Leu Pro Phe Glu Phe Cys
450 455 460

Pro Leu Ala Glu Lys Val Gly Asn Leu Asp Thr Glu Arg Leu Asn Val
465 470 475 480

Arg Lys Arg Glu Ala Val Ala Val His Trp Leu Gln His Ser Leu Tyr
485 490 495

Asp Val Thr Gly Ser Asp Ala His Thr Leu Trp Leu Leu Gln Arg Leu
500 505 510

Ala Pro Lys Val Val Thr Val Val Glu Gln Asp Leu Ser His Ala Gly
515 520 525

Ser Phe Leu Gly Arg Phe Val Glu Ala Ile His Tyr Tyr Ser Ala Leu
530 535 540

Phe Asp Ser Leu Gly Ala Ser Tyr Gly Glu Glu Ser Glu Glu Arg His
545 550 555 560

Val Val Glu Gln Gln Leu Leu Ser Lys Glu Ile Arg Asn Val Leu Ala
565 570 575

Val Gly Gly Pro Ser Arg Ser Gly Glu Val Lys Phe Glu Ser Trp Arg
580 585 590

Glu Lys Met Gln Gln Cys Gly Phe Lys Gly Ile Ser Leu Ala Gly Asn
595 600 605

Ala Ala Thr Gln Ala Thr Leu Leu Leu Gly Met Phe Pro Ser Asp Gly
610 615 620

Tyr Thr Leu Val Asp Asp Asn Gly Thr Leu Lys Leu Gly Trp Lys Asp
625 630 635 640

Leu Ser Leu Leu Thr Ala Ser Ala Trp Thr Pro Arg Ser
645 650

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Pro Ala Val Gln Thr Asn Thr Ala Glu Ala Leu Arg Glu Arg Lys Glu
1 5 10 15

Glu Ile Lys Arg Gln Lys Gln
20

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Leu Lys Arg Ala Arg Asn Thr Glu Ala Ala Arg Arg Ser Arg Ala Arg
1 5 10 15

Lys Leu Gln Arg Met Lys Gln
20

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

Arg Arg Leu Ala Gln Asn Arg Glu Ala Ala Arg Lys Ser Arg Leu Arg
1 5 10 15

Lys Lys Ala Tyr Val Gln Gln
20

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Ile Arg Arg Glu Arg Asn Lys Met Ala Ala Ala Lys Cys Arg Asn Arg
1 5 10 15

Arg Arg Glu Leu Thr Asp Thr
20

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

Arg Lys Arg Met Arg Asn Arg Ile Ala Ala Ser Lys Cys Arg Lys Arg
1 5 10 15

Lys Leu Glu Arg Ile Ala Arg
20

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Val Arg Leu Met Lys Asn Arg Glu Ala Ala Arg Glu Cys Arg Arg Lys
1 5 10 15

Lys Lys Glu Tyr Val Lys Cys
20

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Lys Arg Lys Glu Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Tyr Arg
1 5 10 15

Lys Ala Ala His Leu Lys Glu
20

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Met Arg Gln Ile Arg Asn Arg Asp Ser Ala Met Lys Ser Arg Glu Arg
1 5 10 15

Lys Lys Ser Tyr Ile Lys Asp
20

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

Arg Arg Met Val Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Lys Lys
1 5 10 15

Lys Gln Ala His Leu Ala Asp
20

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 43 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

Ala Phe Glu Lys Glu Asp Ser Val His Ile Ile Asp Leu Asp Ile Met
1 5 10 15

Gln Gly Leu Gln Trp Pro Gly Leu Phe His Ile Leu Ala Ser Arg Pro
20 25 30

Gly Gly Pro Pro His Val Arg Leu Thr Gly Leu
35 40

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 43 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

Ala Val Lys Asn Glu Ser Phe Val His Ile Ile Asp Phe Gln Ile Ser
1 5 10 15

Gln Gly Gly Gln Trp Val Ser Leu Ile Arg Ala Leu Gly Ala Arg Pro
20 25 30

Gly Gly Pro Pro Asn Val Arg Ile Thr Gly Ile
35 40

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 43 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

Ala Met Glu Gly Glu Lys Met Val His Val Ile Asp Leu Asp Ala Ser
1 5 10 15

Glu Pro Ala Gln Trp Leu Ala Leu Leu Gln Ala Phe Asn Ser Arg Pro
20 25 30

Glu Gly Pro Pro His Leu Arg Ile Thr Gly Val
35 40

(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

Ala Ile Lys Gly Glu Glu Glu Val His Ile Ile Asp Phe Asp Ile Asn
1 5 10 15

Gln Gly Asn Gln Tyr Met Thr Leu Ile Arg Ser Ile Ala
20 25

(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

Ile His Val Ile Asp Phe Xaa Leu Gly Val Gly Gly Gln Trp Ala Ser
1 5 10 15

Phe Leu Gln Glu Leu Ala His Arg Arg Gly
20 25

(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

Val His Ile Ile Xaa Phe Xaa Leu Met Gln Gly Leu Gln Trp Pro Ala
1 5 10 15

Leu Met Asp Val Phe Ser Ala Arg Lys Gly Gly Pro Pro Lys Leu Arg
20 25 30

Ile Thr Gly Ile
35

(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1085 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

GGCACGAGCC CAACGGCTCC TGAGCTTCTT ACTTATATCC ATATCTTCTA TGAAOCCTGC 60
CCTTATTTCA AATTCGGTTA TGAATCTGCT AATGGAGCTA TAGCTGAAGC TGTGAAGAAC 120
CAAAGTTTTG TGCACATTAT CGATTTCCAG ATTTCTCAAG GTGGTCAATG GGTGAGTTTG 180
ATCCGTGCTC TTGGTGCTAG ACCTGGTGGA CCTCCGAACG TTAGGATAAC GGGAATTGAT 240
GATCCGAGAT CATCGTTTGC TCGTCAAGGA GGACTTGAGT TAGTTCGACA AAGACTTGGG 300
AAGCTAGCTG AAATGTGOGG TGTTCCGTTT GAGTTCCATG GAGCTGCTTT ATGCTGCACG 360
GAAGTOGAAA TCGAGAAGCT AGGAGTTAGA AATGGAGAAG CGCTCGCGGT TAACTTCCCG 420
CTTGTTCTTC ACCACATGCC TGATGAGAGT GTAACTGTGG AGAATCACAG AGATAGATTG 480
TTGAGATTGG TCAAACACTT GTCACCAAAC GTTGTGACTC TGGTTGAGCA AGAAGCGAAT 540
ACAAACACTG CGCCCTTTCT TCCCCGGTTT GTCCAGACAA TGAACCATTA CTTGGCAGTT 600
TTCGAATCAA TAGATGTGAA ACTCCCTAGA GATCACAAGG AAAGGATCAA TGTTGAGCAG 660
CATTGTTTGG CTAGAGAGCT TCTGAATCTT ATAGCTTGTG AAGGTGTTGA AAGAGAAGAG 720
AGGCACGAGC CACTAGGGAA ATGGAGGTCT CGCTTTCACA TGGCGGGATT TAAACCGTAT 780
CCTTTGAGCT OGTATGTGAA CCCAACAATC AAAGGATTGC TTGAGAGTTA TTCAGAGAAG 840
TATACACTTG AAGAAAGAGA TGGAGCATTG TATTTAGGAT GGAAGAATCA ACCTCTTATC 900
ACTTCTTGTG CTTCCAGGTA ACTAATAAAA ACCTTGTTCG GTTTCAGAAG AGATTAGAAA 960
CTTCTTTTAA AGTTTCCAGA ATCTGTTTGT AAAAGTAAAA CTCATGCATG ATCCGNAGGA 1020
ACAACTTGTC AAATGTTGTA GTAGTAACTG ATATGTTGAT CACCCAAAAA AAAAAAAAAA 1080
AAAAA 1085

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 306 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:

Gly Thr Ser Pro Thr Gly Pro Glu Leu Leu Thr Tyr Met His Ile Leu
1 5 10 15

Tyr Glu Ala Cys Pro Tyr Phe Lys Phe Gly Tyr Glu Ser Ala Asn Gly
20 25 30

Ala Ile Ala Glu Ala Val Lys Asn Glu Ser Phe Val His Ile Ile Asp
35 40 45

Phe Gln Ile Ser Gln Gly Gly Gln Trp Val Ser Leu Ile Arg Ala Leu
50 55 60

Gly Ala Arg Pro Gly Gly Pro Pro Asn Val Arg Ile Thr Gly Ile Asp
65 70 75 80

Asp Pro Arg Ser Ser Phe Ala Arg Gln Gly Gly Leu Glu Leu Val Gly
85 90 95

Gln Arg Leu Gly Lys Leu Ala Glu Met Cys Gly Val Pro Phe Glu Phe
100 105 110

His Gly Ala Ala Leu Phe Cys Thr Glu Val Glu Ile Glu Lys Leu Gly
115 120 125

Val Arg Asn Gly Glu Ala Leu Ala Val Asn Phe Pro Leu Val Leu His
130 135 140

His Met Pro Asp Glu Ser Val Thr Val Glu Asn His Arg Asp Arg Leu
145 150 155 160

Leu Arg Leu Val Lys His Leu Ser Pro Asn Val Val Thr Leu Val Glu
165 170 175

Gln Glu Ala Asn Thr Asn Thr Ala Pro Phe Leu Pro Arg Phe Val Glu
180 185 190

Thr Met Asn His Tyr Leu Ala Val Phe Glu Ser Ile Asp Val Lys Leu
195 200 205

Ala Arg Asp His Lys Glu Arg Ile Asn Val Glu Gln His Cys Leu Ala
210 215 220

Arg Glu Val Glu Asn Leu Ile Ala Cys Glu Gly Val Glu Arg Glu Glu
225 230 235 240

Arg His Glu Pro Leu Gly Lys Trp Arg Ser Arg Phe His Met Ala Gly
245 250 255

Phe Lys Pro Tyr Pro Leu Ser Ser Tyr Val Asn Ala Thr Ile Lys Gly
260 265 270

Leu Leu Glu Ser Tyr Ser Glu Lys Tyr Thr Leu Glu Glu Arg Asp Gly
275 280 285

Ala Leu Tyr Leu Gly Trp Lys Asn Gln Pro Leu Ile Thr Ser Cys Ala
290 295 300

Trp Arg
305

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1231 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

GCTATGCAAG GAGAGAAGAT GGTTCATGTG ATTGATCTCG ATGCTTCTGA CCCAGCTCAA 60

TGGCTTGCTT TGCTTCAAGC TTTTAACTCT AGGCCTGAAG GTCCACCTCA TTTGAGAATC 120

ACTGGTGTTC ATCACCAGAA GGAAGTGCTT GAACAAATGG CTCATAGACT CATTGAGGAA 180 

GCAGAGAAAC TCGATATCCC CTTTCAGTTT AATCCCGTTG TGAGTAGGTT AGACTGTTTA 240

AATGTAGAAC AGTTGCGGGT TAAAACAGGA GAGGCCTTAG CCGTTAGCTC GGTTCTTCAA 300 

TTGCATACCT TCTTGGCCTC TGATGATGAT CTCATGAGAA AGAACTGCGC TTTACGGTTT 360 

CAGAACAACC CTAGTGGAGT TGACTTGCAG AGAGTTCTAA TGATGAGCCA TGGCTCTGCA 420 

GCTGAGGCAC GTGAGAATGA TATGAGTAAC AACAATGGGT ATAGCCCTAG CGGTGACTCG 480 

GCCTCATCTT TGCCTTTACC AAGTTCAGGA AGGACTGATA GCTTCCTCAA TGCTATTTGG 540 

CGTTTGTCTC CAAACCTCAT GGTGCTCACT GAGCAAGACT CAGACCACAA CGGCTCCACA 600

CTAATGGAGA GGCTATTAGA ATCACTTTAC ACCTACGCAG CATTGTTTGA TTGCTTGGAA 660 

ACAAAAGTTC CAAGAACGTC TCAAGATAGG ATCAAAGTGC AGAAGATGCT CTTCGGGGAG 720

GAGATCAAGA ACATCATATC CTGCGAGGGA TTTGAGAGAA GAGAAAGACA CGAGAAGCTT 780 

GAGAAATGGA GCCAGAGGAT CGATTTGGCT GGTTTTGGGA ATCTTCCTCT TAGCTATTAT 840 

GCCATGTTGC AGGCTAGGAG ATTCCTTCAA GGGTGCGGTT TTGATGGGTA TAGAATCAAG 900

GAAGAGAGCG GCTGCGCAGT AATTTGCTGG CAAGATCGAC CTCTATACTC GGTATCAGCT 960 

TGGAGATGCA GGAAGTGAAT GATATATTAC AGTTTGTCTT CTATTTTGGT TATGAGCAGA 1020 

GTCCCTTTCT TTTTTGTATA CATGGGGACA CAATCTTAGT TGTTTTGTGA TGGTGACTTT 1080 

CTGTCTCTTT ATGCTATTTT GGCTTAAATG CTTCTACTGC CTCTGCATCT AAAGCCTTTG 1140 

TGTGTTGGTT CAATTTGCTC TGGTCTGGCT GTAATACCAA ACCAAATCCA ATTTGAGCTG 1200 

AAGATAACTA ATTTGATGAT CGGCTOGTGC C 1231 
(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 325 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

Ala Met Glu Gly Glu Lys Met Val His Val Ile Asp Leu Asp Ala Ser
1 5 10 15

Glu Pro Ala Gln Trp Leu Ala Leu Leu Gln Ala Phe Asn Ser Arg Pro
20 25 30

Glu Gly Pro Pro His Leu Arg Ile Thr Gly Val His His Gln Lys Glu
35 40 45

Val Leu Glu Gln Met Ala His Arg Leu Ile Glu Glu Ala Glu Lys Leu
50 55 60

Asp Ile Pro Phe Gln Phe Asn Pro Val Val Ser Arg Leu Asp Cys Leu
65 70 75 80

Asn Val Glu Gln Leu Arg Val Lys Thr Gly Glu Ala Leu Ala Val Ser
85 90 95

Ser Val Leu Gln Leu His Thr Phe Leu Ala Ser Asp Asp Asp Leu Met
100 105 110

Arg Lys Asn Cys Ala Leu Arg Phe His Asn Asn Pro Ser Gly Val Asp
115 120 125

Leu Gln Arg Val Leu Met Met Ser His Gly Ser Ala Ala Glu Ala Arg
130 135 140

Glu Asn Asp Met Ser Asn Asn Asn Gly Tyr Ser Pro Ser Gly Asp Ser
145 150 155 160

Ala Ser Ser Leu Pro Leu Pro Ser Ser Gly Arg Thr Asp Ser Phe Leu
165 170 175

Asn Ala Ile Trp Gly Leu Ser Pro Lys Val Met Val Val Thr Glu Gln
180 185 190

Asp Ser Asp His Asn Gly Ser Thr Leu Met Glu Arg Leu Leu Glu Ser
195 200 205

Leu Tyr Thr Tyr Ala Ala Leu Phe Asp Cys Leu Glu Thr Lys Val Pro
210 215 220

Arg Thr Ser Gln Asp Arg Ile Lys Val Glu Lys Met Leu Phe Gly Glu
225 230 235 240

Glu Ile Lys Asn Ile Ile Ser Cys Glu Gly Phe Glu Arg Arg Glu Arg
245 250 255

His Glu Lys Leu Glu Lys Trp Ser Gln Arg Ile Asp Leu Ala Gly Phe
260 265 270

Gly Asn Val Pro Leu Ser Tyr Tyr Ala Met Leu Gln Ala Arg Arg Leu
275 280 285

Leu Gln Gly Cys Gly Phe Asp Gly Tyr Arg Ile Lys Glu Glu Ser Gly
290 295 300

Cys Ala Val Ile Cys Trp Gln Asp Arg Pro Leu Tyr Ser Val Ser Ala
305 310 315 320

Trp Arg Cys Arg Lys
325

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1368 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
CTTTGTCAAT GGTAAATGAG CTGAGGCAGA TAGTTTCTAT CCAAGGAGAC CCTTCTCAGA 60
GAATCCCAGC TTACATGGTG GAAGGTCTAG CTGCAAGAAT GGCCCCTTCA GGAAAATTCA 120
TCTACAGAGC ATTGAAATGC AAAGAGCCTC CTTCGGATGA GAGGCTTGCA GCTATGCAAG 180
TCCTGTTTGA AGTCTGCCCT TGTTTCAAGT TCGGGTTTTT AGCAGCTAAT GGTGCGATAC 240
TTGAAGCAAT CAAAGGTGAA GAAGAAGTTC ACATAATCGA TTTCGATATA AACCAAGGGA 300
ACCAATACAT GACACTCATA CGAAGCATTG CTGAGTTGCC TGGTAAACGA CCTCGCCTGA 360
GGTTAACAGG AATTGATGAC CCTGAATCAG TCCAACGCTC CATTGGAGGG CTAAGAATCA 420
TCGGTCTAAG ACTCGAGCAA CTCGCAGAGG ATAATGGAGT ATCCTTCAAA TTCAAAGCAA 480
TGCCTTCAAA GACTTCGATT GTCTCTCCAT CAACACTCGG TTGCAAACCA GGAGAAACCT 540
TAATAGTGAA CTTTGCATTC CAACTTCACC ACATGCCTGA CGAGAGTGTC ACAACAGTAA 600
ACCAGCGGGA CGAGCTACTT CACATGGTCA AAAGCTTAAA CCCAAAGCTT GTCACGGTCG 660
TTGAACAAGA CGTGAACACA AACACTTCAC CGTTCTTTCC CAGATTCATA GAGGCTTACG 720
AATACTACTC AGCAGTTTTC GAGTCTCTAG ACATGACACT TCCAAGAGAA AGCCAAGAGA 780
GGATGAATGT AGAAAGACAG TGTCTCGCTA GAGACATAGT CAACATTGTT GCTTGCGAAG 840
GAGAAGAACG GATAGAGAGA TACGAGGCTG CGGGAAAATG GAGAGCAAGG ATGATGATGG 900
CTGGATTCAA TCCAAAACCA ATGAGTGCTA AAGTAACCAA CAATATACAA AACCTGATAA 960
AGCAACAATA TTGCAATAAG TACAAGCTTA AAGAAGAAAT GGGTGAGCTC CATTTTTGCT 1020
GGGAGGAGAA AAGCTTAATC GTTGCTTCAG CTTGGAGGTA AGATAAGTGA CAAGAGCATA 1080
TAGTCTTTAT GTTTCATAAA ACATAATTAT GTTTTTACTG TAATCTTGGG TTATTGTGTA 1140
ACTGGTTAAA TCATCTCCAT GTATTATTAC CAGAGGTTAG GGGTGATCAC AGGTACTAAA 1200
AGCTAATCTA ACACTTATGG AAGAATTTTT CTTTCTTTTT TTTCCCTATT ATATAAAAAT 1260
AATTAGAGTT TTGGTTCTAA ACCTATTTGC TAAGTGTGAA TGAGTCTTTA CATGTTCATA 1320
TTTCAGTTCA AATGGTTAAA TTTGTTAAGG TTCTCACTTA AAAAAAAA 1368

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 351 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Leu Ser Met Val Asn Glu Leu Arg Gln Ile Val Ser Ile Gln Gly Asp
1 5 10 15

Pro Ser Gln Arg Ile Ala Ala Tyr Met Val Glu Gly Leu Ala Ala Arg
20 25 30

Met Ala Ala Ser Gly Lys Phe Ile Tyr Arg Ala Leu Lys Cys Lys Glu
35 40 45

Pro Pro Ser Asp Glu Arg Leu Ala Ala Met Gln Val Leu Phe Glu Val
50 55 60

Cys Pro Cys Phe Lys Phe Gly Phe Leu Ala Ala Asn Gly Ala Ile Leu
65 70 75 80

Glu Ala Ile Lys Gly Glu Glu Glu Val His Ile Ile Asp Phe Asp Ile
85 90 95

Asn Gln Gly Asn Gln Tyr Met Thr Leu Ile Arg Ser Ile Ala Glu Leu
100 105 110

Pro Gly Lys Arg Pro Arg Leu Arg Leu Thr Gly Ile Asp Asp Pro Glu
115 120 125

Ser Val Gln Arg Ser Ile Gly Gly Leu Arg Ile Ile Asn Leu Arg Leu
130 135 140

Glu Gln Leu Ala Glu Asp Asn Gly Val Ser Phe Lys Phe Lys Ala Met
145 150 155 160

Pro Ser Lys Thr Ser Ile Val Ser Pro Ser Thr Leu Gly Cys Lys Pro
165 170 175

Gly Glu Thr Leu Ile Val Asn Phe Ala Phe Gln Leu His His Met Pro
180 185 190

Asp Glu Ser Val Thr Thr Val Asn Gln Arg Asp Glu Leu Leu His Met
195 200 205

Val Lys Ser Leu Asn Pro Leu Val Thr Val Val Glu Gln Asp Val Asn
210 215 220

Thr Asn Thr Ser Pro Phe Phe Pro Arg Phe Ile Glu Ala Tyr Glu Tyr
225 230 235 240

Tyr Ser Ala Val Phe Glu Ser Leu Asp Met Thr Leu Pro Arg Glu Ser
245 250 255

Gln Glu Arg Met Asn Val Glu Arg Gln Cys Leu Ala Arg Asp Ile Val
260 265 270

Asn Ile Val Ala Cys Glu Gly Glu Glu Arg Ile Glu Arg Tyr Glu Ala
275 280 285

Ala Gly Lys Trp Arg Ala Arg Met Met Met Ala Gly Phe Asn Pro Lys
290 295 300

Pro Met Ser Ala Lys Val Thr Asn Asn Ile Gln Asn Leu Ile Lys Gln
305 310 315 320

Gln Tyr Cys Asn Lys Tyr Lys Leu Lys Glu Glu Met Gly Glu Leu His
325 330 335

Phe Cys Trp Glu Glu Lys Ser Leu Ile Val Ala Ser Ala Trp Arg
340 345 350

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 100 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

CGAGGAGGCG TTCGAGCGGG AGGAGCGTGT GCACATCATC GACCTCGACA TCATGCAGGG 60
GCTGCAGTGG CCGGGCCTCC TCCACATCCT TGCCTCCCGC 100
(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Gln Glu Ala Phe Glu Arg Glu Glu Arg Val His Ile Ile Asp Leu Asp
1 5 10 15

Ile Met Gln Gly Leu Gln Trp Pro Gly Leu Phe His Ile Leu Ala Ser
20 25 30

Arg

(2) INFORMATION FOR SEQ ID NO:26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1094 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 



CCACGCCTCC GTCAAAGGAT ACAACCATCT ACACATAATT GACTTTTCCC TGATGCAAGG 60
TCTCCAGTGG CCGGCACTCA TGGATGTCTT CTCCGCCCGT GAGGGTGGGC CACCAAAGCT 120
CCCAATCACA GGCATTGGCC CGAACCCAAT AGGTGGCCGT GACGAGCTCC ATGAAGTGGG 180
AATTCGCCTC GCCAAGTATG CACACTCGGT GGGTATCGAC TTCACTTTCC AGGGAGTCTG 240
TGTCGATCAA CTTGATAGGT TGTGCGACTG GATGCTTCTC AAACCAATCA AAGGAGAGGC 300
AGTTGCCATA AACTCCATCC TACAACTCCA TCGCCTCCTC GTTGACCCAG ATGCAAACCC 360
AGTGGTGCCC GCACCAATAG ATATCCTCCT CAAATTGGTC ATCAAGATAA ACCCCATGAT 420
CTTCACGGTG GTTGAGCATG AGGCAGATCA CAACAGACCA CCACTACTAG AGAGGTTCAC 480
TAATGCCCTC TTCCACTATG CGACCATGTT TGACTCTTTG GAGGCCATGC ATCGTTGTAC 540
CAGTGGTAGA GACATCACCG ACTCACTCAC AGAGGTGTAC CTTCGAGGTG AGATTTTTGA 600
CATTGTCTCC GGOGAGGGCA GTGGACGCAC OGAACGTGAT GAGTTGTTTG GTCACTGGAG 660
GGAGAGGCTC ACCTATGCTG GCCTAACTCA AGTGTGGTTC GACCCCGATG AGGTTGACAC 720
GCTAAAAGAC CAGTTGATCC ATGTGACATC CTTATCTGGC TCTGGGTTCA ACATCCTAGT 780
GTGTGATGGC AGCCTTGCAC TAGCGTGGCA TAATCGCCCG TTATATGTGG CAACAGCTTG 840
GTGTGTGACA GGAGGAAATG CTGCCAGTTC CATGCTTGGC AACATCTGTA AGGGTACAAA 900
TGATAGTAGA AGAAAGGAAA ACCGTAATGG ACCCATGGAG TAGCAGGAAG AATAACCATG 960
TCATGAGCAA ATOGATCAAG TAATAAAATG CACTGATGAC ATGCATGGTG ATCTAAAGTT 1020
TTTTTGCGTG AATGTGCAAT GACGAATTGT TCAATTTGAA TAACCTAATC ATGAGACTCA 1080
AAAAAAAAAA AAAA 1094


(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 313 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

His Ala Ser Val Lys Gly Tyr Asn His Val His Ile Ile Asp Phe Ser
1 5 10 15

Leu Met Gln Gly Leu Gln Trp Pro Ala Leu Met Asp Val Phe Ser Ala
20 25 30

Arg Glu Gly Gly Pro Pro Lys Leu Arg Ile Thr Gly Ile Gly Pro Asn
35 40 45

Pro Ile Gly Gly Arg Asp Glu Leu His Glu Val Gly Ile Arg Leu Ala
50 55 60

Lys Tyr Ala His Ser Val Gly Ile Asp Phe Thr Phe Gln Gly Val Cys
65 70 75 80

Val Asp Gln Leu Asp Arg Leu Cys Asp Trp Met Leu Leu Lys Pro Ile
85 90 95

Lys Gly Glu Ala Val Ala Ile Asn Ser Ile Leu Gln Leu His Arg Leu
100 105 110

Leu Val Asp Pro Asp Ala Asn Pro Val Val Pro Ala Pro Ile Asp Ile
115 120 125

Leu Leu Lys Leu Val Ile Lys Ile Asn Pro Met Ile Phe Thr Val Val
130 135 140

Glu His Glu Ala Asp His Asn Arg Pro Pro Leu Leu Glu Arg Phe Thr
145 150 155 160

Asn Ala Leu Phe His Tyr Ala Thr Met Phe Asp Ser Leu Glu Ala Met
165 170 175

His Arg Cys Thr Ser Gly Arg Asp Ile Thr Asp Ser Leu Thr Glu Val
180 185 190

Tyr Leu Arg Gly Glu Ile Phe Asp Ile Val Cys Gly Glu Gly Ser Ala
195 200 205

Arg Thr Glu Arg His Glu Leu Phe Gly His Trp Arg Glu Arg Leu Thr
210 215 220

Tyr Ala Gly Leu Thr Gln Val Trp Phe Asp Pro Asp Glu Val Asp Thr
225 230 235 240

Leu Lys Asp Gln Leu Ile His Val Thr Ser Leu Ser Gly Ser Gly Phe
245 250 255

Asn Ile Leu Val Cys Asp Gly Ser Leu Ala Leu Ala Trp His Asn Arg
260 265 270

Pro Leu Tyr Val Ala Thr Ala Trp Cys Val Thr Gly Gly Asn Ala Ala
275 280 285

Ser Ser Met Val Gly Asn Ile Cys Lys Gly Thr Asn Asp Ser Arg Arg
290 295 300

Lys Glu Asn Arg Asn Gly Pro Met Glu
305 310

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 611 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 



CCCAACTTGG GAAGCCCTTC CTCCGCTCCG CCTCCTACCT CAAGGAGGCC CTCCTCCTCG 60
CACTCGCCGA CAGCCACCAT GGCTCCTCCG GCGTCACCTC GCCGCTCGAC GTTGCCCTCA 120
AGCTTGCAGC ATACAAGTCT TTCTCTGACC TGTCACCTGT GCTCCAGTTC ACTAACTTTA 180
CCGCAACAAG GCGCTTCTTG ATGAGATTGG TGGCATGGCA ACTTCCTGCA TCCATGTCAT 240
TGACTTTGAT CTCGGTGTTG GTGGTCAGTG GGCTTCCTTC TTGCAGGAGC TTGCCCACCG 300
CCGGGGAGCT GGAGGTATGG CCTTGCCGTT GTTGAAGCTC ACGGCTTTCA TGTCGACTGC 360
TTCTCACCAT CCACTGGAGC TGCACCTTAC CCAGGATAAC CTCTCTCAGT TTGCCGCAGA 420
GCTCAGAATT CCTTTCGAAT TCAATGCCGT CAGTCTTGAT GCATTCAATC CTGCGGAATC 480
TATTTCTTCC TCTGGTGATG AAGTTGTTGC TGTTAGCCTC CCTGTTGGCT GCTCTGCTCG 540


TGCACCACCG 


CTGCCAGCGA 




GGTGAAACAG 


CTTTGTCCTA AGGTTGTCGT 


600
GGCTATTGAT C 611
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 502 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TACAGAGCAA CAGCAGTATA ATATTAATTC 60

TGTACCACAC AACCATTTGA TAGGTTAAAT TACCCTCTAG TCTCTACTCA TAAGCAGTGT 120 

TTCCAATGAG ATGATCATGG CTAATTGAGC AGAGCATGGC AACAACCTAA AGCAACATCA 180 

TTAGCTATAG AGACTGACAC CAATATTCCT AAATCCACTA GGCTAGCTAA TAAGCTGCAA 240 

CGAAAAGCAA TATGAAGAGT TCAACAGCTC AAGACAACAA TTTCATTTGC AACATTTAAT 300 

TGCAAGAATA AATGGACATT ACTGGAGTGG TCGATGCTTG CAAACGGTGG TGGAACCTTG 360 

GTGGAGTGAA GCTTATGGCT GATCAGCACC GCCAAGATGA TATGGATACA AGCTCCCCAC 420 

GCTGCCAGTA GAGCGTAAGA GCAGCTCCGC GTTTCTCCAC ATGGAATCCT CGGACCTGCA 480 

CCCGCTTCAG GAGGCAGTCT GC 502
(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 298 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Pro Gln Gln Gln Gln Gln His Gln Gln Gln Gln Gln Gln His Lys Pro
1 5 10 15

Pro Pro Pro Pro Ile Gln Gln Gln Glu Arg Glu Asn Ser Ser Thr Asp
20 25 30

Ala Pro Pro Gln Pro Glu Thr Val Thr Ala Thr Val Pro Ala Val Gln
35 40 45

Thr Asn Thr Ala Glu Ala Leu Arg Glu Arg Lys Glu Glu Ile Lys Arg
50 55 60

Gln Lys Gln Asp Glu Glu Gly Leu His Leu Leu Thr Leu Leu Leu Gln
65 70 75 80

Cys Ala Glu Ala Val Ser Ala Asp Asn Leu Glu Glu Ala Asn Lys Leu
85 90 95

Leu Leu Glu Ile Ser Gln Leu Ser Thr Pro Tyr Gly Thr Ser Ala Gln
100 105 110

Arg Val Ala Ala Tyr Phe Ser Glu Ala Met Ser Ala Arg Leu Leu Asn
115 120 125

Ser Cys Leu Gly Ile Tyr Ala Ala Leu Pro Ser Arg Trp Met Pro Gln
130 135 140

Thr His Ser Leu Lys Met Val Ser Ala Phe Gln Val Phe Asn Gly Ile
145 150 155 160

Ser Pro Leu Val Lys Phe Ser His Phe Thr Ala Asn Gln Ala Ile Gln
165 170 175

Glu Ala Phe Glu Lys Glu Asp Ser Val His Ile Ile Asp Leu Asp Ile
180 185 190

Met Gln Gly Leu Gln Trp Pro Gly Leu Phe His Ile Leu Ala Ser Arg
195 200 205

Pro Gly Gly Pro Pro His Val Arg Leu Thr Gly Leu Gly Thr Ser Met
210 215 220

Glu Ala Leu Gln Ala Thr Gly Lys Arg Leu Ser Asp Phe Thr Asp Lys
225 230 235 240

Leu Gly Leu Pro Phe Glu Phe Cys Pro Leu Ala Glu Lys Val Gly Asn
245 250 255

Asp Leu Thr Glu Arg Leu Asn Val Arg Lys Arg Glu Ala Ala Val His
260 265 270

Trp Leu Gln His Ser Leu Tyr Asp Val Thr Gly Ser Asp Ala His Thr
275 280 285

Leu Trp Leu Leu Gln Arg Leu Ala Pro Lys
290 295

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 307 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

Gly Thr Ser Pro Thr Gly Pro Glu Leu Leu Thr Tyr Met His Ile Leu
1 5 10 15

Tyr Glu Ala Cys Pro Tyr Phe Lys Phe Gly Tyr Glu Ser Ala Asn Gly
20 25 30

Ala Ile Ala Glu Ala Val Lys Asn Glu Ser Phe Val His Ile Ile Asp
35 40 45

Phe Gln Ile Ser Gln Gly Gly Gln Trp Val Ser Leu Ile Arg Ala Leu
50 55 60

Gly Ala Arg Pro Gly Gly Pro Pro Asn Val Arg Ile Thr Gly Ile Asp
65 70 75 80

Asp Pro Arg Ser Ser Phe Ala Arg Gln Gly Gly Leu Glu Leu Val Gly
85 90 95

Gln Arg Leu Gly Lys Leu Ala Glu Met Cys Gly Val Pro Phe Glu Phe
100 105 110

His Gly Ala Ala Leu Cys Cys Thr Glu Val Glu Ile Glu Lys Leu Gly
115 120 125

Val Arg Asn Gly Glu Ala Leu Ala Val Asn Phe Pro Leu Val Leu His
130 135 140

His Met Pro Asp Glu Ser Val Thr Val Glu Asn His Arg Asp Arg Leu
145 150 155 160

Leu Arg Leu Val Lys His Leu Ser Pro Asn Val Val Thr Leu Val Glu
165 170 175

Gln Glu Ala Asn Thr Asn Thr Ala Pro Phe Leu Pro Arg Phe Val Glu
180 185 190

Thr Met Asn His Tyr Leu Ala Val Phe Glu Ser Ile Asp Val Lys Leu
195 200 205

Ala Arg Asp His Lys Glu Arg Ile Asn Val Glu Gln His Cys Leu Ala
210 215 220

Arg Glu Val Val Asn Leu Ile Ala Cys Glu Gly Val Glu Arg Glu Glu
225 230 235 240

Arg His Glu Pro Leu Gly Lys Trp Arg Ser Arg Phe His Met Ala Gly
245 250 255

Phe Lys Pro Tyr Pro Leu Ser Ser Tyr Val Asn Ala Thr Ile Lys Gly
260 265 270

Leu Leu Glu Ser Tyr Ser Glu Lys Tyr Thr Leu Glu Glu Arg Asp Gly
275 280 285

Ala Leu Tyr Leu Gly Trp Lys Asn Gln Pro Leu Ile Thr Ser Cys Ala
290 295 300

Trp Arg Xaa
305

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 353 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Leu Ser Met Val Asn Glu Leu Arg Gln Ile Val Ser Ile Gln Gly Asp
1 5 10 15

Pro Ser Gln Arg Ile Ala Ala Tyr Met Val Glu Gly Leu Ala Ala Arg
20 25 30

Met Ala Ala Ser Gly Lys Phe Ile Tyr Arg Ala Leu Lys Cys Lys Glu
35 40 45

Pro Pro Ser Asp Glu Arg Leu Ala Ala Met Gln Val Leu Phe Glu Val
50 55 60

Cys Pro Cys Phe Lys Phe Gly Phe Leu Ala Ala Asn Gly Ala Ile Leu
65 70 75 80

Glu Ala Ile Lys Gly Glu Glu Glu Val His Ile Ile Asp Phe Asp Ile
85 90 95

Asn Gln Gly Asn Gln Tyr Met Thr Leu Ile Arg Ser Ile Ala Glu Leu
100 105 110

Pro Gly Lys Arg Pro Arg Leu Arg Leu Thr Gly Ile Asp Asp Pro Glu
115 120 125

Ser Val Gln Arg Ser Ile Gly Gly Leu Arg Ile Ile Gly Leu Arg Leu
130 135 140

Glu Gln Leu Ala Glu Asp Asn Gly Val Ser Phe Lys Phe Lys Ala Met
145 150 155 160

Pro Ser Lys Thr Ser Ile Val Ser Pro Ser Thr Leu Gly Cys Lys Pro
165 170 175

Gly Glu Thr Leu Ile Val Asn Phe Ala Phe Gln Leu His His Met Pro
180 185 190

Asp Glu Ser Val Thr Thr Val Asn Gln Arg Asp Glu Leu Leu His Met
195 200 205

Val Ser L * U Asn pro Leu v «l Thr Val val Glu Gin Aap Val 

21° 215 220 

Asn Thr Asn Thr Ser Pro Phe Phe Pro Arg Phe Ile Glu Ala Tyr Glu
225 230 235 240

Tyr Tyr Ser Ala Val Phe Glu Ser Leu Asp Met Thr Leu Pro Arg Glu
245 250 255

Ser Gln Glu Arg Met Asn Val Glu Arg Gln Cys Leu Ala Arg Asp Ile
260 265 270

Val Asn Ile Val Ala Cys Glu Gly Glu Glu Arg Ile Glu Arg Tyr Glu
275 280 285

A1 * oi« ° ly Ly ' Trp Ala Aro Met Mat Met A *» Cly Phe Aen Pro 

290 295 300 

Lys Pro Met Ser Ala Lys Val Thr Asn Asn Ile Gln Asn Leu Ile Lys
305 310 315 320

Gln Gln Tyr Cys Asn Lys Tyr Lys Leu Lys Glu Glu Met Gly Glu Leu
325 330 335

His Phe Cys Trp Glu Glu Lys Ser Leu Ile Val Ala Ser Ala Trp Arg
340 345 350



(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 326 amino acids






WO 97/41152 



PCT/US97/07022



(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Ala Met Glu Gly Glu Lys Met Val His Val Ile Asp Leu Asp Ala Ser
1 5 10 15

Glu Pro Ala Gln Trp Leu Ala Leu Leu Gln Ala Phe Asn Ser Arg Pro
20 25 30

Glu Gly Pro Pro His Leu Arg Ile Thr Gly Val His His Gln Lys Glu
35 40 45

Val Leu Glu Gln Met Ala His Arg Leu Ile Glu Glu Ala Glu Lys Leu
50 55 60

Asp Ile Pro Phe Gln Phe Asn Pro Val Val Ser Arg Leu Asp Cys Leu
65 70 75 80

Asn Val Glu Gln Leu Arg Val Lys Thr Gly Glu Ala Leu Ala Val Ser
85 90 95

Ser Val Leu Gln Leu His Thr Phe Leu Ala Ser Asp Asp Asp Leu Met
100 105 110

Arg Lys Asn Cys Ala Leu Arg Phe Gln Asn Asn Pro Ser Gly Val Asp
115 120 125

Leu Gln Arg Val Leu Met Met Ser His Gly Ser Ala Ala Glu Ala Arg
130 135 140

Glu Asn Asp Met Ser Asn Asn Asn Gly Tyr Ser Pro Ser Gly Asp Ser
145 150 155 160

Ala Ser Ser Leu Pro Leu Pro Ser Ser Gly Arg Thr Asp Ser Phe Leu
165 170 175

Asn Ala Ile Trp Gly Leu Ser Pro Lys Val Met Val Val Thr Glu Gln
180 185 190

Asp Ser Asp His Asn Gly Ser Thr Leu Met Glu Arg Leu Leu Glu Ser
195 200 205

Leu Tyr Thr Tyr Ala Ala Leu Phe Asp Cys Leu Glu Thr Lys Val Pro
210 215 220

Arg Thr Ser Gln Asp Arg Ile Lys Val Glu Lys Met Leu Phe Gly Glu
225 230 235 240

Glu Ile Lys Asn Ile Ile Ser Cys Glu Gly Phe Glu Arg Arg Glu Arg
245 250 255

His Glu Lys Leu Glu Lys Trp Ser Gln Arg Ile Asp Leu Ala Gly Phe
260 265 270

Gly Asn Val Pro Leu Ser Tyr Tyr Ala Met Leu Gln Ala Arg Arg Leu
275 280 285

Leu Gln Gly Cys Gly Phe Asp Gly Tyr Arg Ile Lys Glu Glu Ser Gly
290 295 300

Cys Ala Val Ile Cys Trp Gln Asp Arg Pro Leu Tyr Ser Val Ser Ala
305 310 315 320

Trp Arg Cys Arg Lys Xaa
325

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 277 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Asn Lys Arg Leu Lys Ser Cys Ser Ser Pro Asp Ser Met Val Thr Ser
1 5 10 15

Thr Ser Thr Gly Thr Gln Ile Gly Gly Val Ile Gly Thr Thr Val Thr
20 25 30

Thr Thr Thr Thr Thr Thr Thr Ala Ala Ala Glu Ser Thr Arg Ser Val
35 40 45

Ile Leu Val Asp Ser Gln Glu Asn Gly Val Arg Leu Val His Ala Leu
50 55 60

Met Ala Cys Ala Glu Ala Ile Gln Gln Asn Asn Leu Thr Leu Ala Glu
65 70 75 80

Ala Leu Val Lys Gln Ile Gly Cys Leu Ala Val Ser Gln Ala Gly Ala
85 90 95

Met Arg Lys Val Ala Thr Tyr Phe Ala Glu Ala Leu Ala Arg Arg Ile
100 105 110

Tyr Arg Leu Ser Pro Pro Gln Asn Gln Ile Asp His Cys Leu Ser Asp
115 120 125

Thr Leu Gln Met His Phe Tyr Glu Thr Cys Pro Tyr Leu Lys Phe Ala
130 135 140

His Phe Thr Ala Asn Gln Ala Ile Leu Glu Ala Phe Glu Gly Lys Lys
145 150 155 160

Arg Val His Val Ile Asp Phe Ser Met Asn Gln Gly Leu Gln Trp Pro
165 170 175

Ala Leu Met Gln Ala Leu Ala Leu Arg Glu Gly Gly Pro Pro Thr Phe
180 185 190

Arg Leu Thr Gly Ile Gly Pro Pro Ala Pro Asp Asn Ser Asp His Leu
195 200 205

His Glu Val Gly Cys Lys Leu Ala Gln Leu Ala Glu Ala Ile His Val
210 215 220

Glu Phe Glu Tyr Arg Gly Phe Val Ala Asn Ser Leu Ala Asp Leu Asp
225 230 235 240

Ala Ser Met Leu Glu Leu Arg Pro Ser Asp Thr Glu Ala Val Ala Val
245 250 255

Asn Ser Val Phe Glu Leu His Lys Leu Leu Gly Arg Xaa Gly Gly Ile
260 265 270

Glu Lys Val Leu Gly
275

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 262 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

Gly Gly Gly Gly Asp Thr Tyr Thr Thr Asn Lys Arg Leu Lys Cys Ser
1 5 10 15

Asn Gly Val Val Glu Thr Thr Thr Ala Thr Ala Glu Ser Thr Arg His
20 25 30

Val Val Leu Val Asp Ser Gln Glu Asn Gly Val Arg Leu Val His Ala
35 40 45

Leu Leu Ala Cys Ala Glu Ala Val Gln Lys Glu Asn Leu Thr Val Ala
50 55 60

Glu Ala Leu Val Lys Gln Ile Gly Phe Leu Ala Val Ser Gln Ile Gly
65 70 75 80

Ala Met Arg Gln Val Ala Thr Tyr Phe Ala Glu Ala Leu Ala Arg Arg
85 90 95

Ile Tyr Arg Leu Ser Pro Ser Gln Ser Pro Ile Asp His Ser Leu Ser
100 105 110

Asp Thr Leu Gln Met His Phe Tyr Glu Thr Cys Pro Tyr Leu Lys Phe
115 120 125

Ala His Phe Thr Ala Asn Gln Ala Ile Leu Glu Ala Phe Gln Gly Lys
130 135 140

Lys Arg Val His Val Ile Asp Phe Ser Met Ser Gln Gly Leu Gln Trp
145 150 155 160

Pro Ala Leu Met Gln Ala Leu Ala Leu Arg Pro Gly Gly Pro Pro Val
165 170 175

Phe Arg Leu Thr Gly Ile Gly Pro Pro Ala Pro Asp Asn Phe Asp Tyr
180 185 190

Leu His Glu Val Gly Cys Lys Leu Ala His Leu Ala Glu Ala Ile His
195 200 205

Val Glu Phe Glu Tyr Arg Gly Phe Val Ala Asn Thr Leu Ala Asp Leu
210 215 220

Asp Ala Ser Met Leu Glu Leu Arg Pro Ser Glu Ile Glu Ser Val Ala
225 230 235 240

Val Asn Ser Val Phe Glu Leu His Lys Leu Leu Gly Arg Pro Gly Ala
245 250 255

Ile Asp Lys Val Leu Gly
260

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 203 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

Gln Leu Gly Lys Pro Phe Leu Arg Ser Ala Ser Tyr Leu Lys Glu Ala
1 5 10 15

Leu Leu Leu Ala Leu Ala Asp Ser His His Gly Ser Ser Gly Val Thr
20 25 30

Ser Pro Leu Asp Val Ala Leu Lys Leu Ala Ala Tyr Lys Ser Phe Ser
35 40 45

Asp Leu Ser Pro Val Leu Gln Phe Thr Asn Phe Thr Ala Asn Lys Ala
50 55 60

Leu Leu Asp Glu Ile Gly Gly Met Ala Thr Ser Cys Ile His Val Ile
65 70 75 80

Asp Phe Asn Leu Gly Val Gly Gly Gln Trp Ala Ser Phe Leu Gln Glu
85 90 95

Leu Ala His Arg Arg Gly Ala Gly Gly Met Ala Leu Pro Leu Leu Lys
100 105 110

Leu Thr Ala Phe Met Ser Thr Ala Ser His His Pro Leu Glu Leu His
115 120 125

Leu Thr Gln Asp Asn Leu Ser Gln Phe Ala Ala Glu Leu Arg Ile Pro
130 135 140

Phe Glu Phe Asn Ala Val Ser Leu Asp Ala Phe Asn Pro Ala Glu Ser
145 150 155 160

Ile Ser Ser Ser Gly Asp Glu Val Val Ala Val Ser Leu Pro Val Gly
165 170 175

Cys Ser Ala Arg Ala Pro Pro Leu Pro Ala Ile Leu Arg Leu Val Lys
180 185 190

Gln Leu Cys Pro Lys Val Val Val Ala Ile Asp
195 200

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 






(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

His Ala Ser Val Lys Gly Tyr Asn His Val His Ile Ile Asp Phe Ser
1 5 10 15

Leu Met Gln Gly Leu Gln Trp Pro Ala Leu Met Asp Val Phe Ser Ala
20 25 30

Arg Glu Gly Gly Pro Pro Lys Leu Arg Ile Thr Gly Ile Gly Pro Asn
35 40 45

Pro Ile Gly Gly Arg Asp Glu Leu His Glu Val Gly Ile Arg Leu Ala
50 55 60

Lys Tyr Ala His Ser Val Gly Ile Asp Phe Thr Phe Gln Gly Val Cys
65 70 75 80

Val Asp Gln Leu Asp Arg Leu Cys Asp Trp Met Leu Leu Lys Pro Ile
85 90 95

Lys Gly Glu Ala Val Ala Ile Asn Ser Ile Leu Gln Leu His Arg Leu
100 105 110

Leu Val Asp Pro Asp Ala Asn Pro Val Val Pro Ala Pro Ile Asp Ile
115 120 125

Leu Leu Lys
130

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Gln Glu Ala Phe Glu Arg Glu Glu Arg Val His Ile Ile Asp Leu Asp
1 5 10 15

Ile Met Gln Gly Leu Gln Trp Pro Gly Leu Phe His Ile Leu Ala Ser
20 25 30

Arg



(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Phe Ala Gly Cys Arg Arg Val His Val Val Asp Phe Gly Ile Lys Gln
1 5 10 15

Gly Met Gln Trp Pro Ala Leu Leu Xaa Asp Leu Ala Leu
20 25

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 73 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Gly Arg Asn Gly Arg Thr Leu Trp Leu Gly Glu Gly His Ile Asp Leu
1 5 10 15

Trp Pro Leu Gln Gly Leu Leu Ser Gln Gly Leu Gln Arg Ala Leu Cys
20 25 30

Ala Arg Pro Leu Gly Ala Pro His Val Phe Leu Pro Gly Leu His Thr
35 40 45

Leu Ser Leu Gly Leu Gln Xaa Arg His Leu Leu Val His Met Met Ala
50 55 60

Leu Ser Tyr Ser Tyr Gly Arg Xaa Pro
65 70

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 59 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

Thr Ser Asp Ser Ala Ser Ser Phe Asn Ile Pro Thr Ser Ala Gln Asn
1 5 10 15

His Tyr Ala Thr Gly Ser Phe Ser Thr Asn Ser Arg Thr Thr Asn Val
20 25 30

Ala Thr Ala Thr Thr Asn Ser Ala Thr Ala His Trp Val Ala Thr Asp
35 40 45

Ala Glu His Thr Asp Thr Ile Ile Ala Gln Pro
50 55

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 110 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Arg Xaa Phe Asp Ser Leu Glu His Asp Ala Ser Lys Gly Glu Pro Arg
1 5 10 15

Glu Asp Glu Arg Gly Arg Xaa Cys Leu Ala Arg Asn Ile Val Asn Ile
20 25 30

Val Xaa Cys Lys Xaa Glu Glu Arg Ile Glu Arg Tyr Glu Val Thr Gly
35 40 45

Lys Trp Arg Ala Arg Met Met Met Ala Gly Phe Ser Pro Arg Pro Met
50 55 60

Ser Gly Arg Val Thr Ser Asn Ile Glu Ser Leu Ile Lys Arg Asp Tyr
65 70 75 80

Cys Ser Lys Tyr Lys Val Lys Glu Glu Met Gly Glu Leu His Phe Ser
85 90 95

Trp Glu Glu Lys Ser Leu Ile Val Ala Ser Ala Trp Ser Xaa
100 105 110

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 137 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

Asn Gly Ser Tyr Asn Ala Pro Phe Phe Val Thr Arg Phe Arg Glu Ala
1 5 10 15

Leu Phe His Tyr Ser Ala Ile Phe Asp Met Leu Glu Thr Asn Ile Pro
20 25 30

Lys Asp Asn Glu Gln Arg Leu Leu Ile Glu Ser Ala Leu Phe Ser Arg
35 40 45

Glu Xaa Asn Val Ile Ser Cys Glu Gly Leu Glu Arg Met Glu Arg Pro
50 55 60

Glu Thr Tyr Lys Gln Trp Gln Val Arg Asn Gln Arg Val Gly Phe Lys
65 70 75 80

Gln Leu Pro Leu Asn Gln Asp Met Met Lys Arg Ala Arg Xaa Glu Gly
85 90 95

Gln Val Leu Pro Thr Arg Thr Phe Ile Ile Asp Glu Asp Asn Arg Trp
100 105 110

Leu Leu Gln Gly Trp Lys Gly Arg Ile Leu Phe Ala Leu Ser Thr Trp
115 120 125

Lys Pro Asp Asn Arg Ser Ser Ser
130 135

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Asn Gly Gly Ala Phe Ala Pro Ser Thr Trp Thr Ala Arg Ser Leu Asn
1 5 10 15

Gly Gly Ala Phe Ala Pro Ser Thr Trp Thr Ala Arg Ser Leu Pro Val
20 25 30

Pro Ser Ser Pro Ser Thr Asp Ser Phe
35 40

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1279 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 



GCGGCTATCT 


TCTACGGCCA 


CCACCACCAT 


ACACCTCCGC 


CGGCAAAGCG 


GCTCAACCCT 


60 


GGTCCCGTGG 


GGATAACAGA GCAGCTGGTT AAGGCAGCAG 


AGGTCATAGA 


GAGCGACACG 


120 


TGTCTAGCTC AGO GG AT ATT 


GGCGCGGCTC 


AATCAACAGC 


TCTCTTCTCC 


CGTCGGGAAG 


180 


CCATTAGAAA 


GAGCACCTTT 


TTACTTCAAA 


GAAGCTCTCA 


ATAATCTCCT 


TCACAACGTC 


240 


TCCCAAACCC 


TAAACCCTTA 


TTCCCTCATC 


TTCAAGATCG 


CTGCTTACAA 


ATCCTTCTCA 


300 


GAGATCTCTC 


CCGTTCTTCA 


GTTCGCCAAC 


TTTACCTCCA 


ACCAAOCCCT 


CTTAGAGTCC 


360 


TTCCATGCCT 


TCCACOGTCT 


CCACATCATC 


GACTTCGATA 


TCGGCTACGG 


TGGCCAATGG 


420 


GCTTCCCTCA 


TGCAAGAGCT 


TGTTCTCCGC 


GACAACGCCG 


CTCCTCTCTC 


CCTCAAGATC 


460 


ACCCTTTTCC 


CTTCTCCGGC 


GAACCACGAC 


CAGCTCGAAC 


TTGGCTTCAC 


TCAAGACAAC 


540 
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CTCAAGCACT 


TCGCCTCTGA 


GATCAACATC 


TCCCTTGACA 


TCCAAGTTTT 


GAGCTTAGAC 


600 


CTCCTCOOCT 


CCATCTCGTG 


GCCTAACTCG 


TCGGAGAAAG 


AAGCTGTCGC 


CGTTAACATC 


660 


TCCGCCCCGT 


CCTTCTCGCA 


CCTCCCTTTG 


GTCCTCCGTT 


TCGTGAAGCA 


TCTATCTCCG 


720 


ACG AT C ATCG 


TCTGCTCCGA 


CAGAGGATGC 


GAGAGGAGGG 


ATCTGCCCTT 


CTCTCAACAG 


780 


CTCGCCCACT 


CGCTGCACTC 


ACACACCGCT 


CTCTTCGAAT 


CCCTCGACGC 


CGTCAACGCC 


840 


AACCTCGACG 


CAATCCAGAA 


GATCGAGAGG 


TTTCTTATAC 


AGCCGGAGAT 


AGAGAAGCTG 


900 


GTGTTGGATC 


GTAGCCGTCC 


GATAGAAAGG 


CCGATGATGA 


CGTGGCAAGC 


GATGTTTCTA 


960 


CAGATGGGTT 


TCTCACCGGT 


GACGCACAGT 


AACTTCACGG 


AGTCTCAAGC 


CGAGTGTTTA 


1020 


GTCCAAOGGA 


CGCCAGTGAG 


AGGCTTTCAC 


GTCGAGAAGA 


AACATAACTC 


ACTTCTCCTA 


1080 


TCTTGGCAAA 


GGACAGAACT 


CGTCGGAGTT 


TCAGCATGGA 


GATGTCGCTC 


CTCCTGATTT 


1140 


CCACCGGAGT 


TTCAATTATT 


AAAAAAATAT 


TTTCCTTAAT 


TCAATTTATC 


TTAAATGACA 


1200 


AATTTTTAGT 


TTCTGATTTT 


ATTTTGCTCA 


GTGCGATGGA 


TTTTTAAATT 


TAAGTTTCAC 


1260 


ACAAATATAT 


AAATTTTTG 










1279 


(2) INFORMATION FOR SEQ ID NO: 46: 









(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 379 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

Ala Ala Ile Phe Tyr Gly His His His His Thr Pro Pro Pro Ala Lys
1 5 10 15

Arg Leu Asn Pro Gly Pro Val Gly Ile Thr Glu Gln Leu Val Lys Ala
20 25 30

Ala Glu Val Ile Glu Ser Asp Thr Cys Leu Ala Gln Gly Ile Leu Ala
35 40 45

Arg Leu Asn Gln Gln Leu Ser Ser Pro Val Gly Lys Pro Leu Glu Arg
50 55 60

Ala Ala Phe Tyr Phe Lys Glu Ala Leu Asn Asn Leu Leu His Asn Val
65 70 75 80

Ser Gln Thr Leu Asn Pro Tyr Ser Leu Ile Phe Lys Ile Ala Ala Tyr
85 90 95

Lys Ser Phe Ser Glu Ile Ser Pro Val Leu Gln Phe Ala Asn Phe Thr
100 105 110

Ser Asn Gln Ala Leu Leu Glu Ser Phe His Gly Phe His Arg Leu His
115 120 125

Ile Ile Asp Phe Asp Ile Gly Tyr Gly Gly Gln Trp Ala Ser Leu Met
130 135 140

Gln Glu Leu Val Leu Arg Asp Asn Ala Ala Pro Leu Ser Leu Lys Ile
145 150 155 160

Thr Val Phe Ala Ser Pro Ala Asn His Asp Gln Leu Glu Leu Gly Phe
165 170 175

Thr Gln Asp Asn Leu Lys His Phe Ala Ser Glu Ile Asn Ile Ser Leu
180 185 190

Asp Ile Gln Val Leu Ser Leu Asp Leu Leu Gly Ser Ile Ser Trp Pro
195 200 205

Asn Ser Ser Glu Lys Glu Ala Val Ala Val Asn Ile Ser Ala Ala Ser
210 215 220

Phe Ser His Leu Pro Leu Val Leu Arg Phe Val Lys His Leu Ser Pro
225 230 235 240

Thr Ile Ile Val Cys Ser Asp Arg Gly Cys Glu Arg Thr Asp Leu Pro
245 250 255

Phe Ser Gln Gln Leu Ala His Ser Leu His Ser His Thr Ala Leu Phe
260 265 270

Glu Ser Leu Asp Ala Val Asn Ala Asn Leu Asp Ala Met Gln Lys Ile
275 280 285

Glu Arg Phe Leu Ile Gln Pro Glu Ile Glu Lys Leu Val Leu Asp Arg
290 295 300

Ser Arg Pro Ile Glu Arg Pro Met Met Thr Trp Gln Ala Met Phe Leu
305 310 315 320

Gln Met Gly Phe Ser Pro Val Thr His Ser Asn Phe Thr Glu Ser Gln
325 330 335

Ala Glu Cys Leu Val Gln Arg Thr Pro Val Arg Gly Phe His Val Glu
340 345 350

Lys Lys His Asn Ser Leu Leu Leu Cys Trp Gln Arg Thr Glu Leu Val
355 360 365

Gly Val Ser Ala Trp Arg Cys Arg Ser Ser Xaa
370 375

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 745 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

TGCATACAAC GCACCGTTTT TCGTAACACG GTTTCGCGAA GCTCTATTTC ATTTCTCCTC 60

GATTTTTGAC ATGCTTGAGA CAATTGTGCC ACGAGAAGAC GAAGAGAGGA TGTTCCTTGA 120

GATGGAGGTC TTTGGGAGAC AGGCACTGAA TGTGATTGCT TGCGAAGGTT GGGAAAGAGT 180

GGAGAGGCCT GAGACATACA AGCAGTGGCA CGTACGGGCT ATGAGGTCAG GCTTGGTGCA 240

GGTTCCATTT GACCCAAGCA TTATGAAGAC ATCGCTGCAT AAGGTCCACA CATTCTACCA 300

CAAGGATTTT GTGATCGATC AAGATAACCG GTGGCTCTTG CAAGGCTGGA AGGGAAGAAC 360

TGTCATGCCT CTTTCTGTTT GGAAACCAGA GTCCAAGGCT TGACCGAGAA ATCCTCGTTG 420 

GCATATGAGA GACCATCTCT TGATTTTCTT CCTGTGTAAT TCCCAGAGAC AGAATTACAG 480 

ATGTAAGAAG AGAATGCTGC ACAAAGAACT TGTTGAAAGA TAATATTGAT GTAAGTCCTG 540 

TTTTATAACT TTCTAGCTGT GTTTTTGTTG TTTCTCAGCT AGATTCTCCT AACGGTATTC 600 

TTGTAGCTAG GGTGATCAGA TTGTTTGTAT ATTGCTAGCA GAGTTAGTTT GTCTAGATTG 660

TAACACATAT AAGAGGAAGC TTAGAGTTTC TATGGTTTAA AGAGAAGTTT TTTCCTTCTC 720 

CAATGTAAAA AAAAAAAAAA AAAAA 745 
(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 134 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

Ala Tyr Asn Ala Pro Phe Phe Val Thr Arg Phe Arg Glu Ala Leu Phe
1 5 10 15

His Phe Ser Ser Ile Phe Asp Met Leu Glu Thr Ile Val Pro Arg Glu
20 25 30

Asp Glu Glu Arg Met Phe Leu Glu Met Glu Val Phe Gly Arg Glu Ala
35 40 45

Leu Asn Val Ile Ala Cys Glu Gly Trp Glu Arg Val Glu Arg Pro Glu
50 55 60

Thr Tyr Lys Gln Trp His Val Arg Ala Met Arg Ser Gly Leu Val Gln
65 70 75 80

Val Pro Phe Asp Pro Ser Ile Met Lys Thr Ser Leu His Lys Val His
85 90 95

Thr Phe Tyr His Lys Asp Phe Val Ile Asp Gln Asp Asn Arg Trp Leu
100 105 110

Leu Gln Gly Trp Lys Gly Arg Thr Val Met Ala Leu Ser Val Trp Lys
115 120 125

Pro Glu Ser Lys Ala Xaa
130

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 775 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

AAAAAATGGG AAACCATGAC TCTTGATGAA CTTATGATCA ATCCAGGAGA GACAACGGTC 60

GTCAACTGCA TTCATCGGTT ACAATACACT CCTGATGAAA CTGTGTCATT AGACTCTCCA 120

AGAGACACGG TTCTGAAGCT ATTCAGAGAT ATCAATCCTG ACCTCTTTGT GTTTGCAGAG 180 

ATTAACGGAA TGTACAACTC TCCTTTCTTC ATGACGAGGT TCCGAGAAGC GCTTTTTCAT 240 

TACTCTTCAC TCTTTGACAT GTTTGACACC ACAATACACG CAGAGGATGA GTACAAAAAC 300 

AGGTCACTGT TGGAGAGAGA GTTACTTGTG AGAGACGCGA TGAGCGTGAT TTCCTGCGAG 360 

GGTGCAGAGC GGTTTGCGAG GCCTGAAACC TACAAGCAAT GGCGAGTTAG GATTTTGAGA 420 

GCCGGGTTTA AGCCAGCAAC TATTAGCAAA CAGATCATGA AGGAGGCTAA GGAAATTGTG 480 

AGGAAACGTT ACCATAGAGA TTTTGTGATC GATAGCGATA ACAATTGGAT GCTTCAAGGA 540 

TGGAAAGGAA GAGTCATCTA TGCTTTTTCT TGCTGGAAAC CTGCTGAGAA GTTCACAAAC 600 

AATAATTTAA ACATCTGAAA AATGTTACTT CTCAATTACA TCATTTTTGT TTCCCAATGG 660 

TTTTGTAGAA TATGTTTGAT CCCGTGAGTG GATGCAACTC TTTTTTCCTG CAAGTACATA 720 

TTGTATTCAA ATCCTTGTGG AAATGATAAA TTGTTTAATC AAAAAAAAAA AAAAA 775 
(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 206 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

Lys Lys Trp Glu Thr Ile Thr Leu Asp Glu Leu Met Ile Asn Pro Gly
1 5 10 15

Glu Thr Thr Val Val Asn Cys Ile His Arg Leu Gln Tyr Thr Pro Asp
20 25 30

Glu Thr Val Ser Leu Asp Ser Pro Arg Asp Thr Val Leu Lys Leu Phe
35 40 45

Arg Asp Ile Asn Pro Asp Leu Phe Val Phe Ala Glu Ile Asn Gly Met
50 55 60

Tyr Asn Ser Pro Phe Phe Met Thr Arg Phe Arg Glu Ala Leu Phe His
65 70 75 80

Tyr Ser Ser Leu Phe Asp Met Phe Asp Thr Thr Ile His Ala Glu Asp
85 90 95

Glu Tyr Lys Asn Arg Ser Leu Leu Glu Arg Glu Leu Leu Val Arg Asp
100 105 110

Ala Met Ser Val Ile Ser Cys Glu Gly Ala Glu Arg Phe Ala Arg Pro
115 120 125

Glu Thr Tyr Lys Gln Trp Arg Val Arg Ile Leu Arg Ala Gly Phe Lys
130 135 140

Pro Ala Thr Ile Ser Lys Gln Ile Met Lys Glu Ala Lys Glu Ile Val
145 150 155 160

Arg Lys Arg Tyr His Arg Asp Phe Val Ile Asp Ser Asp Asn Asn Trp
165 170 175

Met Leu Gln Gly Trp Lys Gly Arg Val Ile Tyr Ala Phe Ser Cys Trp
180 185 190

Lys Pro Ala Glu Lys Phe Thr Asn Asn Asn Leu Asn Ile Xaa
195 200 205

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 548 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51: 

AATCGCTTGA ACCGAATTTG GATCGAGATT CGAAAGAAAG GCTGAGAGTG GAGAGAGTGC 60 

TGTTCGGTAG GAGGATTATG GATTTGGTCC GATCAGATGA TGATAATAAT AAACCGGGAA 120 

CCCGGTTTGG GTTAATGGAG GAGAAAGAAC AATGGAGAGT GTTGATGGAG AAAGCTGGAT 180 

TTGAGCCGGT TAAACCGAGT AATTACGCGG TTAGCCAAGC GAAGCTGCTA CTATGGAACT 240 

ACAATTATAG TACATTGTAT TCACTTGTTG AATCGGAGCC AGGTTTCATC TCCTTGGCTT 300 

GGAACAATGT GCCTCTCCTC ACCGTTTCCT CTTGGCGTTG ACTACTTGGT CCGATAAGTT 360 

AATCTAGTAT TTTGAGTTAG CTTTTAGAAT TGAATTGTTT GGGGTTAGAT TTGGATGTTT 420 

AATTAGTCTC TAGCCTATTC TCTTACTCTT TTTTGTCTAG TGCTTGGAGT GATGATGGTT 480 

TGTCGTTTAT GTTCATTTGT AATATATATT GTATGTAACA TTTGACTAAA AAAAAAAAAA 540 
AAAAAAAA 548

(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 113 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

Ser Leu Glu Pro Asn Leu Asp Arg Asp Ser Lys Glu Arg Leu Arg Val
1 5 10 15

Glu Arg Val Leu Phe Gly Arg Arg Ile Met Asp Leu Val Arg Ser Asp
20 25 30

Asp Asp Asn Asn Lys Pro Gly Thr Arg Phe Gly Leu Met Glu Glu Lys
35 40 45

Glu Gln Trp Arg Val Leu Met Glu Lys Ala Gly Phe Glu Pro Val Lys
50 55 60

Pro Ser Asn Tyr Ala Val Ser Gln Ala Lys Leu Leu Leu Trp Asn Tyr
65 70 75 80

Asn Tyr Ser Thr Leu Tyr Ser Leu Val Glu Ser Glu Pro Gly Phe Ile
85 90 95

Ser Leu Ala Trp Asn Asn Val Pro Leu Leu Thr Val Ser Ser Trp Arg
100 105 110

Xaa



(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1093 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

GCGAATGTTG AGATCTTGCA AGCAATAGCT GGGGAAACCA GAGTCCACAT TATCGATTTT 60

CAGATTGCAC AGGGATCACA ATACATGTTT TTGATTCAGG AGCTTGCGAA ACGCCCTGGT 120

GGGCCGCCGT TGCTGCGTGT GACGGGTGTG GATGATTCAC AGTCCACCTA TGCTCGTGGG 180

GGAGGACTCA GCTTGGTAGG TGAGAGGCTT GCAACTTTGG CGCAGTCATG TGGTGTCCCG 240

TTTGAGTTTC ACGATGCCAT CATGTCTGGG TGCAAGGTGC AGCGGGAACA TCTCGGGTTG 300

GAACCTGCCT TTGCTGTTGT TGTGAACTTC CCATATGTAT TACACCACAT GCCAGACGAG 360

AGCGTAAGTG TTGAAAAATA CAGAGACAGG CTGCTGCATC TGATCAAGAG CCTCTCCCCA 420

AAACTGGTTA CTCTAGTAGA GCAAGAATCC AACACAAACA CCTCGCCATT GGTGTCACGG 480

TTTGTGGAAA CACTGGATTA CTACACAGCG ATGTTTGAGT CGATAGATGC AGCACGGCCA 540

CGGGATGATA AGCAGAGAAT CAGCGCAGAA CAACACTGTG TAGCAAGAGA CATAGTGAAC 600

ATGATAGCAT GTGAGGAGTC AGAGAGAGTA GAGAGACACG AGGTACTGGG GAAATGGAGG 660

GTCAGAATGA TGATGGCTGG GTTCACGGGT TGGCCGGTCA GCACATCTGC AGCGTTTGCA 720

GCGAGTGAGA TGCTGAAAGC TTATGACAAA AACTACAAAC TGGGAGGCCA TGAAGGAGCG 780

CTCTACCTCT TCTGGAAGAG ACGACCCATG GCTACATGTT CCGTGTGGAA GCGAAACCGA 840

AACTATATTG GGTAAGTTAT AGTGATGATG GTTACTTGAG TGGATAAAGA AGAGCACAAC 900 

AAAAACACAT CTGTCGCTGT AAATTTTTTA GGATGTGCAA TGATGTTTTA AGTTGTAACA 960 

CAACCTAAGT TATATATGTA TACAAACCAA ACCTGGTGGT TOTTTTTCTC TTGTAAATTG 1020 

TCATGTGGTT GTGGGTGGGA AGCTAGTAAT GAAATATAAC CAAAACATTG ATTAGGTCAA 1080 

AAAAAAAAAA AAA 1093 

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 285 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Ala Asn Val Glu Ile Leu Glu Ala Ile Ala Gly Glu Thr Arg Val His
1 5 10 15

Ile Ile Asp Phe Gln Ile Ala Gln Gly Ser Gln Tyr Met Phe Leu Ile
20 25 30

Gln Glu Leu Ala Lys Arg Pro Gly Gly Pro Pro Leu Leu Arg Val Thr
35 40 45

Gly Val Asp Asp Ser Gln Ser Thr Tyr Ala Arg Gly Gly Gly Leu Ser
50 55 60

Leu Val Gly Glu Arg Leu Ala Thr Leu Ala Gln Ser Cys Gly Val Pro
65 70 75 80

Phe Glu Phe His Asp Ala Ile Met Ser Gly Cys Lys Val Gln Arg Glu
85 90 95

His Leu Gly Leu Glu Pro Gly Phe Ala Val Val Val Asn Phe Pro Tyr
100 105 110

Val Leu His His Met Pro Asp Glu Ser Val Ser Val Glu Lys Tyr Arg
115 120 125

Asp Arg Leu Leu His Leu Ile Lys Ser Leu Ser Pro Lys Leu Val Thr
130 135 140

Leu Val Glu Gln Glu Ser Asn Thr Asn Thr Ser Pro Leu Val Ser Arg
145 150 155 160

Phe Val Glu Thr Leu Asp Tyr Tyr Thr Ala Met Phe Glu Ser Ile Asp
165 170 175

Ala Ala Arg Pro Arg Asp Asp Lys Gln Arg Ile Ser Ala Glu Gln His
180 185 190

Cys Val Ala Arg Asp Ile Val Asn Met Ile Ala Cys Glu Glu Ser Glu
195 200 205

Arg Val Glu Arg His Glu Val Leu Gly Lys Trp Arg Val Arg Met Met
210 215 220

Met Ala Gly Phe Thr Gly Trp Pro Val Ser Thr Ser Ala Ala Phe Ala
225 230 235 240

Ala Ser Glu Met Leu Lys Ala Tyr Asp Lys Asn Tyr Lys Leu Gly Gly
245 250 255

His Glu Gly Ala Leu Tyr Leu Phe Trp Lys Arg Arg Pro Met Ala Thr
260 265 270

Cys Ser Val Trp Lys Pro Asn Pro Asn Tyr Ile Gly Xaa
275 280 285

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1928 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

AAAGACTTTA GCAGATTTTC AAGCGGCTCA GAACATCAAC AACAACAACA ACAACAACCG 60

TTTTATAGTC AAGCAGCTCT CAACGCTTTT CTTTCAAGGT CTGTGAAGCC TCGAAATTAT 120

CAGAATTTTC AATCTCCGTC GGCCGATGAT TGATCTCACG TCGGTGAATG ATATGAGTTT 180

GTTTGGTGGT TCTGGTTCAT CTCAGCGTTA CGGTTTACCG GTTCCCAGGT CTCAGACGCA 240

ACAGCAACAA TCGGATTACG GTTTATTTGG TGGGATCCGA ATGGGAATCG GGTCGGGTAT 300

TAATAATTAT CCAACATTAA CCGGCGTTCC GTGTATTGAA CCGGTTCAAA ACCGGGTTCA 360

TGAATCGCAG AACATGTTGA ATAGTTTAAG AGAGCTTGAG AAACAGCTTT TAGATGATGA 420

CGATGAGAGT GGTGGTGATG ATGACGTGTC AGTTATAACA AATTCAAATT CCGATTGGAT 480

TCAAAATCTC GTGACTCCGA ACCCGAACCC GAACCCGGTT TTGTCTTTTT CACCGAGCTC 540

TTCTTCTTCG TCTTCTTCCC CTTCTACAGC TTCGACGACG ACATCGGTAT GTTCTAGGCA 600

AACGGTTATG GAAATCGCGA CGGCGATCGC GGAAGGGAAA ACAGAGATAG CGACGGAGAT 660

TTTGGCGCGT GTTTCTCAAA CGCCTAATCT TGAGAGGAAT TCAGAGGAGA AGCTTGTTCA 720

TTTCATGGTG GCTGCGCTTC GATCGAGGAT AGCTTCTCCA GTGACGGAAT TGTATGGGAA 780

GGAGCATTTA ATCTCGACTC AATTGCTCTA CGAGCTCTCT CCTTGTTTCA AACTCGGTTT 840

CGAGGCCGCG AATCTCGCCA TTCTCGACCC CGCCGATAAC AACGACGGTG GAATGATGAT 900

ACCGCACGTT ATCGATTTCG ATATCGGAGA AGGTGGACAA TACGTTAACC TTCTCCGTAC 960

ATTATCCACG CGCCGGAATG GTAAAAGTCA GAGTCAGAAT TCTCCGGTGG TTAAGATCAC 1020

CGCCGTGGCG AACAACGTTT ACGGATGTTT AGTCGATGAC GGTGGAGAAG AGAGGTTAAA 1080

AGCCGTCGGA GATTTGTTGA GCCAACTCGG TGATCGACTC GGTATCTCCG TAAGTTTCAA 1140

CGTGGTGACG AGTTTACGAC TCGGTGATCT GAATCGTGAA TCTCTCGGGT GTGATCCCGA 1200

CGAGACTTTG GCTGTGAACT TAGCTTTCAA GCTTTATCGT GTTCCCGACG AAAGCGTATG 1260

CACGGAGAAT CCAAGAGAGG AACTTCTCCG GCGCGTGAAG GGACTTAAAC CGCGCGTGGT 1320

TACTCTAGTG GAGCAAGAAA TGAATTCGAA TACGGCGCCG TTTTTAGGGA GAGTGAGTGA 1380

GTCATGCGCG TGTTACGGTG CGTTGCTTGA GTCGGTOGAG TCTACGGTTC CTAGTACGAA 1440

TTCCGACCGT GCCAAAGTTG AGGAAGGAAT TGGCCGGAAG CTAGTAAACG CGGTGGCGTG 1500

CGAAGGAATC GATCGTATAG AGCGGTGCGA GGTGTTCGGG AAATGGCGAA TGCGGATGAG 1560

CATGGCTGGG TTTGAGTTAA TGCCATTGAG TGAGAAGATA GCGGAGTCGA TGAAGAGTCG 1620

TGGAAACCGA GTCCACCCGG GCTTTACCGT TAAAGAAGAT AACGGAGGTG TGTGCTTTGG 1680

TTGGATGGGA CGGGCACTCA CTGTCGCATC CGCTTGGCGT TAACTTCACA 1740

TTTCTTCTTA TTATTACCAT ATTATTATTA ATTTTCGAGA TTATTCTGAT ATTATTATCA 1800

TTGTGATTTT CCGTTTCGAA AAGTGTAGGA ATCTTATGTA ACAAAGAAAA AAAAAAGACT 1860

TTTATGTTTT TCTAATAATA AAAGAAAGAG TGATTGGGTT CAAAAAAAAA AAAAAAAAAA 1920

AAAAAAAA 1928


(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 524 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

Asp Leu Thr Ser Val Asn Asp Met Ser Leu Phe Gly Gly Ser Gly Ser
1 5 10 15

Ser Gln Arg Tyr Gly Leu Pro Val Pro Arg Ser Gln Thr Gln Gln Gln
20 25 30

Gln Ser Asp Tyr Gly Leu Phe Gly Gly Ile Arg Met Gly Ile Gly Ser
35 40 45

Gly Ile Asn Asn Tyr Pro Thr Leu Thr Gly Val Pro Cys Ile Glu Pro
50 55 60

Val Gln Asn Arg Val His Glu Ser Glu Asn Met Leu Asn Ser Leu Arg
65 70 75 80

Glu Leu Glu Lys Gln Leu Leu Asp Asp Asp Asp Glu Ser Gly Gly Asp
85 90 95

Asp Asp Val Ser Val Ile Thr Asn Ser Asn Ser Asp Trp Ile Gln Asn
100 105 110

Leu Val Thr Pro Asn Pro Asn Pro Asn Pro Val Leu Ser Phe Ser Pro
115 120 125

Ser Ser Ser Ser Ser Ser Ser Ser Pro Ser Thr Ala Ser Thr Thr Thr
130 135 140

Ser Val Cys Ser Arg Gln Thr Val Met Glu Ile Ala Thr Ala Ile Ala
145 150 155 160

Glu Gly Lys Thr Glu Ile Ala Thr Glu Ile Leu Ala Arg Val Ser Gln
165 170 175

Thr Pro Asn Leu Glu Arg Asn Ser Glu Glu Lys Leu Val Asp Phe Met
180 185 190

Val Ala Ala Leu Arg Ser Arg Ile Ala Ser Pro Val Thr Glu Leu Tyr
195 200 205

Gly Lys Glu His Leu Ile Ser Thr Gln Leu Leu Tyr Glu Leu Ser Pro
210 215 220

Cys Phe Lys Leu Gly Phe Glu Ala Ala Asn Leu Ala Ile Leu Asp Ala
225 230 235 240

Ala Asp Asn Asn Asp Gly Gly Met Met Ile Pro His Val Ile Asp Phe
245 250 255

Asp Ile Gly Glu Gly Gly Gln Tyr Val Asn Leu Leu Arg Thr Leu Ser
260 265 270

Thr Arg Arg Asn Gly Lys Ser Gln Ser Gln Asn Ser Pro Val Val Lys
275 280 285

Ile Thr Ala Val Ala Asn Asn Val Tyr Gly Cys Leu Val Asp Asp Gly
290 295 300

Gly Glu Glu Arg Leu Lys Ala Val Gly Asp Leu Leu Ser Gln Leu Gly
305 310 315 320

Asp Arg Leu Gly Ile Ser Val Ser Phe Asn Val Val Thr Ser Leu Arg
325 330 335

Leu Gly Asp Leu Asn Arg Glu Ser Leu Gly Cys Asp Pro Asp Glu Thr
340 345 350

Leu Ala Val Asn Leu Ala Phe Lys Leu Tyr Arg Val Pro Asp Glu Ser
355 360 365

Val Cys Thr Glu Asn Pro Arg Asp Glu Leu Leu Arg Arg Val Lys Gly
370 375 380

Leu Lys Pro Arg Val Val Thr Leu Val Glu Gln Glu Met Asn Ser Asn
385 390 395 400

Thr Ala Pro Phe Leu Gly Arg Val Ser Glu Ser Cys Ala Cys Tyr Gly
405 410 415

Ala Leu Leu Glu Ser Val Glu Ser Thr Val Pro Ser Thr Asn Ser Asp
420 425 430

Arg Ala Lys Val Glu Glu Gly Ile Gly Arg Lys Leu Val Asn Ala Val
435 440 445

Ala Glu Gly Ile Asp Arg Ile Glu Arg Cys Glu Val Phe Gly Lys
450 455 460

Trp Arg Met Arg Met Ser Met Ala Gly Phe Glu Leu Met Pro Leu Ser
465 470 475 480

Glu Lys Ile Ala Glu Ser Met Lys Ser Arg Gly Asn Arg Val His Pro
485 490 495

Gly Phe Thr Val Lys Glu Asp Asn Gly Gly Val Cys Phe Gly Trp Met
500 505 510

Gly Arg Ala Leu Thr Val Ala Ser Ala Trp Arg Xaa
515 520

(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2635 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

TCTTACTCAA GGTTCTTCTT TGTCATCTTG TTGCCGAATC CACAAAGAGG AGAATAAAGA 60 

TTCGACCTTT ATTAGATATT AACGACTCTG GATTTTTGGG TTTTTGGAGT TGGATCCACA 120 

TGGGTTCTTA TCCGGATGGA TTCCCTGGAT CCATGGACGA GTTGGATTTC AATAAGGACT 180

TTGATTTGCC TCCCTCCTCA AACCAAACCT TAGGTTTAGC TAATGGGTTC TATTTAGATG 240 

ACTTAGATTT CTCATCCTTG GATCCTCCAG AGGCATATCC CTCCCAGAAC AACAACAACA 300 

ACAACATCAA CAACAAAGCT GTAGCAGGAG ATCTGTTATC ATCTTCATCT GATGACGCTG 360 

ATTTCTCTGA TTCTGTTTTG AAGTATATAA GCCAAGTTCT TATGGAAGAG GATATGGAAG 420 

AGAAGCCTTG TATGTTTCAT GATGCTTTGG CTCTTCAAGC TGCTGAGAAA TCTCTCTATG 480 

AGGCTCTTGG TGAGAAAGAC CCTTCTTCGT CTTCTGCTTC TTCTGTGGAT CATCCTGAGA 540 

GATTGGCTAG TCATAGCCCT GACGGTTCTT GTTCAGGTGG TGCTTTTAGT GATTACGCTA 600 

GCACCACTAC CACTACTTCC TCTGATTCTC ACTGGAGTGT TGATGGTTTG GAGAATAGAC 660 

CTTCTTGGTT ACATACACCT ATGCCGAGTA ATTTTGTTTT CCAGTCTACT TCTAGGTCCA 720 

ACAGTGTCAC CGGTGGTCGT GGTGGTGGTA ATAGTGCGGT TTACGGTTCA GGTTTTGGCG 780 

ATGATTTGGT TTCGAATATG TTTAAAGATG ATGAATTGGC TATGCAGTTC AAGAAAGGGG 840 

TTGAGGAAGC TAGTAAGTTC CTTCCTAAGT CTTCTCAGCT CTTTATTGAT GTGGATAGTT 900 

ACATCCCTAT GAATTCTGGT TCCAAGGAAA ATGGTTCTGA GGTTTTTGTT AAGACGGAGA 960 

AGAAAGATGA GACAGAGCAT CATCATCATC ATAGCTATGC ACCACCACCC AACAGATTAA 1020 

CTGGTAAGAA AAGCCATTGG CGCGACGAAG ATGAAGATTT CGTTGAAGAA AGAAGTAACA 1080 

AGCAATCAGC TGTTTATGTT GAGGAAAGCG AGCTTTCTGA AATGTTTGAT AACATGTTCC 1140 

TATGTGGCCC TGGGAAACCT GTATGCATTC TTAACCAGAA CTTTCCTACA GAATCCGCTA 1200 

AAGTCGTGAC CGCACAGTCA AATGGAGCAA AGATTCGTGG GAAGAAATCA ACTTCTACTA 1260 

GTCATAGTAA CGATTCTAAG AAAGAAACTG CTGATTTGAG GACTCTTTTG GTGTTATGTG 1320 
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CACAAGCTGT ATCAGTOGAT GATCGTAGAA CCGCCAACGT 
AGCATTCTTC GCCTCTAGGC AATGGTTCAG AGCGGTTGGC 
TTGAAGCA CG CTTAGCTGCG ACOGGTACAC AGATCTACAC 
CGTCTGCAGC AGACATGTTG AAGGCTTACC AGACATACAT 
AAG CTGCT AT CATATTTGCT AACCACAGCA TGATGCGTTT 
TCCACATAAT AGATTTCGGA ATATCTTACG GTTTTCAGTG 
TCTCGCTCAG CAGACCTGGT GGTTCGCCTA AGCTTCGAAT 
NNNNNNNNNN NNNNNNNNNN KNNGAGTTCA GGAGACAGGT 
TCAGCGACAC AATGTTCCGT TTGAGTACAA CGCAATTGCT 
AAGTCGAAGA CTTAAAGCTT CGACAAGGAG AGTATGTGGT 
TCAGGAACCT TCTAGATGAG ACCGTTCTGG TAAACAGCCC 
TGATAAGAAA AATAAACCCG AATGTCTTCA TTCCAGCGAT 
CGCCATTCTT TGTCACGAGG TTCAGAGAAG CGTTGTTTCA 
TGTGTGACTC GAAGCTAGCT AGGGAAGACG AGATGAGGCT 
ATGGGAGAGA GATTGTGAAT GTTGTGGCTT CTGAAGGAAC 
AGACATATAA GCAGTGG CAG G CG AG AC TG A TCCGAGCCGG 
AGAAGGAACT GATGCAGAAT CTGAAGTTGA AAATCGAAAA 
ATGTTGATCA AAACGGTAAC TGGTTACTTC AAGGGTGGAA 
CATCTCTATG GGTTCCTTCG TCTTCATAGA TGTTGTTTCT 
TTTATGTAGG GCTTTTCTGT TGATAGTCTC TCGCCAACAC 
TAGGGTTCTT GAACACTAGA ATGTTGTTAT ATTATGCTTG 
TGTAGCCTAA GAGATATAGT ACTCATTCCA TGATCTTTTG 
(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 809 amino acids 

(B) TYPE: amino acid 

(C) STRANDED NESS : single 
<D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 



TTAGCTAAGG 
TCATTATTTT 
CGCTTTATCT 
GTCGGTCTGC 
CACTGCAAAC 
GCCTGCTCTG 
TACCGGTNNN 
CATCGCTTGG 
CAGAAATGGG 
TGTGAACTCT 
GAGAGATGCA 
CTTAAGCGGG 
TTACTCGGCT 
GATGTATGTG 
AGAGAGAGTG 
ATTTAGACAG 
CGGGTACGAT 
AGGTAGAATC 
TACGTTCTAA 
GAGTGGATTA 
TGACATAGCG 
CTATATGTTN 



CAGATACGAG 
GCAAATAGTC 
TCGAAGAAAA 
CCTTTCAAGA 
GCCAACACGA 
ATTCATOCCC 
NNNNNNNNNN 
CTCGATACTG 
GAAACGATCC 
TTGTTCCGTT 
GTTTTGAAGC 
AATTACAACG 
GTGTTTGATA 
TTTGAGTTTT 
GAGAGCCGAG 
CTTCOGCTTG 
AAAAACTTCG 
GTGTATGCTT 
GCGACTGGGA 
AGTTCAGAGT 
TGTGTAAGAG 
CATGT 



1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2635 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Leu Leu Lys Val Leu Leu Cys His Leu Val Ala Glu Ser Thr Lys Arg
1 5 10 15

Arg Ile Lys Ile Arg Pro Leu Leu Asp Ile Asn Asp Ser Gly Phe Leu
20 25 30

Gly Phe Trp Ser Trp Ile His Met Gly Ser Tyr Pro Asp Gly Phe Pro
35 40 45

Gly Ser Met Asp Glu Leu Asp Phe Asn Lys Asp Phe Asp Leu Pro Pro
50 55 60

Ser Ser Asn Gln Thr Leu Gly Leu Ala Asn Gly Phe Tyr Leu Asp Asp
65 70 75 80

Leu Asp Phe Ser Ser Leu Asp Pro Pro Glu Ala Tyr Pro Ser Gln Asn
85 90 95

Asn Asn Asn Asn Asn Ile Asn Asn Lys Ala Val Ala Gly Asp Leu Leu
100 105 110

Ser Ser Ser Ser Asp Asp Ala Asp Phe Ser Asp Ser Val Leu Lys Tyr
115 120 125

Ile Ser Gln Val Leu Met Glu Glu Asp Met Glu Glu Lys Pro Cys Met
130 135 140

Phe His Asp Ala Leu Ala Leu Gln Ala Ala Glu Lys Ser Leu Tyr Glu
145 150 155 160

Ala Leu Gly Glu Lys Asp Pro Ser Ser Ser Ser Ala Ser Ser Val Asp
165 170 175

His Pro Glu Arg Leu Ala Ser His Ser Pro Asp Gly Ser Cys Ser Gly
180 185 190

Gly Ala Phe Ser Asp Tyr Ala Ser Thr Thr Thr Thr Thr Ser Ser Asp
195 200 205

Ser His Trp Ser Val Asp Gly Leu Glu Asn Arg Pro Ser Trp Leu His
210 215 220

Thr Pro Met Pro Ser Asn Phe Val Phe Gln Ser Thr Ser Arg Ser Asn
225 230 235 240

Ser Val Thr Gly Gly Gly Gly Gly Gly Asn Ser Ala Val Tyr Gly Ser
245 250 255

Gly Phe Gly Asp Asp Leu Val Ser Asn Met Phe Lys Asp Asp Glu Leu
260 265 270

Ala Met Gln Phe Lys Lys Gly Val Glu Glu Ala Ser Lys Phe Leu Pro
275 280 285

Lys Ser Ser Gln Leu Phe Ile Asp Val Asp Ser Tyr Ile Pro Met Asn
290 295 300

Ser Gly Ser Lys Glu Asn Gly Ser Glu Val Phe Val Lys Thr Glu Lys
305 310 315 320

Lys Asp Glu Thr Glu His His His His His Ser Tyr Ala Pro Pro Pro
325 330 335

Asn Arg Leu Thr Gly Lys Lys Ser His Trp Arg Asp Glu Asp Glu Asp
340 345 350

Phe Val Glu Glu Arg Ser Asn Lys Gln Ser Ala Val Tyr Val Glu Glu
355 360 365

Ser Glu Leu Ser Glu Met Phe Asp Asn Met Phe Leu Cys Gly Pro Gly
370 375 380

Lys Pro Val Cys Ile Leu Asn Gln Asn Phe Pro Thr Glu Ser Ala Lys
385 390 395 400

Val Val Thr Ala Gln Ser Asn Gly Ala Lys Ile Arg Gly Lys Lys Ser
405 410 415

Thr Ser Thr Ser His Ser Asn Asp Ser Lys Lys Glu Thr Ala Asp Leu
420 425 430

Arg Thr Leu Leu Val Leu Cys Ala Gln Ala Val Ser Val Asp Asp Arg
435 440 445

Arg Thr Ala Asn Val Xaa Leu Arg Gln Ile Arg Glu His Ser Ser Pro
450 455 460

Leu Gly Asn Gly Ser Glu Arg Leu Ala His Tyr Phe Ala Asn Ser Leu
465 470 475 480

Glu Ala Arg Leu Ala Gly Thr Gly Thr Gln Ile Tyr Thr Ala Leu Ser
485 490 495

Ser Lys Lys Thr Ser Ala Ala Asp Met Leu Lys Ala Tyr Gln Thr Tyr
500 505 510

Met Ser Val Cys Pro Phe Lys Lys Ala Ala Ile Ile Phe Ala Asn His
515 520 525

Ser Met Met Arg Phe Thr Ala Asn Ala Asn Thr Ile His Ile Ile Asp
530 535 540

Phe Gly Ile Ser Tyr Gly Phe Gln Trp Pro Ala Leu Ile His Arg Leu
545 550 555 560

Ser Leu Ser Arg Pro Gly Gly Ser Pro Lys Leu Arg Ile Thr Gly Xaa
565 570 575

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Glu Phe Arg Arg Gln
580 585 590

Val Ile Ala Trp Leu Asp Thr Val Ser Asp Thr Met Phe Arg Leu Ser
595 600 605

Thr Thr Gln Leu Leu Arg Asn Gly Glu Thr Ile Gln Val Glu Asp Leu
610 615 620

Lys Leu Arg Gln Gly Glu Tyr Val Val Val Asn Ser Leu Phe Arg Phe
625 630 635 640

Arg Asn Leu Leu Asp Glu Thr Val Leu Val Asn Ser Pro Arg Asp Ala
645 650 655

Val Leu Lys Leu Ile Arg Lys Ile Asn Pro Asn Val Phe Ile Pro Ala
660 665 670

Ile Leu Ser Gly Asn Tyr Asn Ala Pro Phe Phe Val Thr Arg Phe Arg
675 680 685

Glu Ala Leu Phe His Tyr Ser Ala Val Phe Asp Met Cys Asp Ser Lys
690 695 700

Leu Ala Arg Glu Asp Glu Met Arg Leu Met Tyr Val Phe Glu Phe Tyr
705 710 715 720

Gly Arg Glu Ile Val Asn Val Val Ala Ser Glu Gly Thr Glu Arg Val
725 730 735

Glu Ser Arg Glu Thr Tyr Lys Gln Trp Gln Ala Arg Leu Ile Arg Ala
740 745 750

Gly Phe Arg Gln Leu Pro Leu Glu Lys Glu Leu Met Gln Asn Leu Lys
755 760 765


Leu Lys Ile Glu Asn Gly Tyr Asp Lys Asn Phe Asp Val Asp Gln Asn
770 775 780

Gly Asn Trp Leu Leu Gln Gly Trp Lys Gly Arg Ile Val Tyr Ala Ser
785 790 795 800

Ser Leu Trp Val Pro Ser Ser Ser Xaa
805

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 90 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

Gln Glu Ala Asp His Asn Lys Thr Gly Phe Leu Asp Arg Phe Thr Glu
1 5 10 15

Ala Leu Phe Tyr Tyr Ser Ala Val Phe Asp Ser Leu Asp Ala Ala Asn
20 25 30

Asn Asn Asn Asn Asn Asn Asn Gln Arg Met Glu Ala Glu Tyr Leu Gln
35 40 45

Arg Glu Ile Cys Asp Ile Val Cys Gly Glu Gly Ala Ala Arg Xaa Glu
50 55 60

Arg His Glu Pro Leu Ser Arg Trp Arg Asp Arg Leu Thr Arg Ala Gly
65 70 75 80

Leu Ser Ala Val Pro Leu Gly Ser Asn Ala
85 90

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 199 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Daucus carota

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

TCTGCAGACA ATTTTNAGGA GGCCAATACC ATGCTATTGG AAATTTCAGA ACTGTCCACA 60

CCTNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNGTACTTC TCAGAGGNAA TGTCGGNNAG 120

ATTAGTTAGC TCCTGCTTAG GAATCTATGC TTCTCTTCCN GCAACAGTGG TGCCTCCTCA 180

TGGTCAGAAA GTGGCCTCA 199

(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 66 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Daucus carota

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

Ser Ala Asp Asn Phe Xaa Glu Ala Asn Thr Met Leu Leu Glu Ile Ser
1 5 10 15

Glu Leu Ser Thr Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Tyr
20 25 30

Phe Ser Glu Xaa Met Ser Xaa Arg Leu Val Ser Ser Cys Leu Gly Ile
35 40 45

Tyr Ala Ser Leu Pro Ala Thr Val Val Pro Pro His Gly Gln Lys Val
50 55 60

Ala Ser
65

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 321 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Glycine max 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

TCAACTGAGA ATCTAGAAGA TGCCAACAAG ATGCTTCTGG AGATTTCTCA GTTATCAACA 60 

CCGTTCNNCA CTTCAGCACA GCGTGTGGCA GCATATTTCT CAGAAGCCAT ATCACCAAGG 120 

TTGGTGAGTT CATCTCTAGG CATATACGCA ACTTTGCCAC ACACACACCA AAGCCACAAG 180 

GTAGCTTCAG CTTTTCAAGT GTTCAATGGT ATTAGTCCTT TAGTGGAGTT CTCACACTTC 240 

ACAGCAAACC AAGCAATTCA AGAAGCCTTC GAAAGAGAAG AGAGGGTGCA CATCATAGAT 300 

CTTGATATAA TGCAAGGGTT G 321 
(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 107 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Glycine max

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

Ser Thr Glu Asn Leu Glu Asp Ala Asn Lys Met Leu Leu Glu Ile Ser
1 5 10 15

Gln Leu Ser Thr Pro Phe Xaa Thr Ser Ala Gln Arg Val Ala Ala Tyr
20 25 30

Phe Ser Glu Ala Ile Ser Ala Arg Leu Val Ser Ser Cys Leu Gly Ile
35 40 45

Tyr Ala Thr Leu Pro His Thr His Gln Ser His Lys Val Ala Ser Ala
50 55 60

Phe Gln Val Phe Asn Gly Ile Ser Pro Leu Val Glu Phe Ser His Phe
65 70 75 80

Thr Ala Asn Gln Ala Ile Gln Glu Ala Phe Glu Arg Glu Glu Arg Val
85 90 95

His Ile Ile Asp Leu Asp Ile Met Gln Gly Leu
100 105
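
The nucleotide and peptide entries of this listing are paired (for example, SEQ ID NO: 62 with SEQ ID NO: 63), so a reader may wish to check one against the other. The following is a minimal sketch, not part of the original disclosure, assuming the standard genetic code and reading frame 1; the names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical, and rendering stop codons as "Ter" is a choice made here for illustration. Any codon containing an undetermined base (N) is reported as Xaa, as in the listing.

```python
# Standard genetic code (NCBI translation table 1), indexed in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter codes; "Ter" for stop codons is an assumption
# of this sketch, not a convention taken from the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "X": "Xaa", "*": "Ter",
}


def translate(dna, frame=0):
    """Translate a DNA string (spaces allowed) into three-letter residues."""
    seq = "".join(dna.split()).upper()
    residues = []
    for pos in range(frame, len(seq) - 2, 3):
        codon = seq[pos:pos + 3]
        # Codons with N or any other non-ACGT symbol are reported as Xaa.
        residues.append(THREE_LETTER[CODON_TABLE.get(codon, "X")])
    return residues


# First 30 bases of SEQ ID NO: 62 as given in the listing.
print(" ".join(translate("TCAACTGAGA ATCTAGAAGA TGCCAACAAG")))
# prints: Ser Thr Glu Asn Leu Glu Asp Ala Asn Lys
```

The printed residues agree with positions 1 to 10 of SEQ ID NO: 63. Other entries of the listing may start in a different reading frame, which the hypothetical `frame` argument is intended to accommodate.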

(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 195 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Picea 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 

TCTGCAGACA ACTTTGAAGA AGCCAATACA ATACTGCCTC AGATCACAGA ACTCTCCACC 60 

CCCTATNGCA ACTCGGTGCA ACGAGTGGCT GCCTATNNNN NNNNNNNNNN NNNNNNNNNN 120 

NNNNNNNNNN NNTGCATAGG AATGTATTCT CCTCTCCCTC CTATTCACAT GTCCCAGAGC 180

CAGAAAATTG TGAAT 195 
(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 65 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Picea

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

Ser Ala Asp Asn Phe Glu Glu Ala Asn Thr Ile Leu Pro Gln Ile Thr
1 5 10 15

Glu Leu Ser Thr Pro Tyr Xaa Asn Ser Val Gln Arg Val Ala Ala Tyr
20 25 30

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys Ile Gly Met
35 40 45

Tyr Ser Pro Leu Pro Pro Ile His Met Ser Gln Ser Gln Lys Ile Val
50 55 60

Asn
65

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2151 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

GATATCAGCA TCATCAATTT TAAATGTAAG TTGGCAAAAG ATCATGAGGG TTCTCATAGT 60 

AATTTGGCCA CAAGGTATGA CACTGTCTCA ATTGAGCAAT CTAGTAGAGA AACTGATCCA 120 

TCATATATTG CTCATATTGA AAGTGAAAAA GATATGCTCA AGAACCTAGT AGAGAAGCTA 180 

AAAATTGAAA AATCTACCTC TACTAGAAAA ATATGATAGG TTGCCTGTTT CTCATGAAAA 240 

TTTATTAGAT AATCATATCA TGGCTAGATG TCGCTCATGA GGTTGTTCTT GCTAGTTTAG 300 

ATTCCTGTGG GCATTCATCT CTTTTAGATG CACTAACATG ATAGGAAGTT TCTAATCTGG 360 

TGCTTCACAA TTCTGGTGAT TCATGCTTCC TTCATTGCAA TTGATATTGA TGCTTGATTC 420 

ATGCTTCAGT CACTTTGTGC GTTTAATTGG TATTGTATGT ATCACTAGAT TGTAGCGTGT 480 

CTGCAACTAG TGTTTCACCA TGTGGTTTTT TAGTATCATT CGTATTAGTT TCTAACTTTC 540 

TATTGATATA TTAAAGTGAT AACTAGTTTT AGAAATATTC TCTTGTGCCA TTAATGCTAC 600 

AACTTOTTTT TAGOGTGTAC OTTAGCATTA TAATATTTCC TTATTATGAA AGCGGAAGAG 660 

AAAOGCGCCC AACCAGAGCA TCCACGTCGT CTCATTTCAC CTTCATCGTT GGATCATAGA 720 

TGAGCGGTCC ACOGTGAACT CCGTTTGCCT GCAAAACCAC GTCCTCTACG CGCTGTTAAG 780 

TAGCTTCTAG AAACATCACG ATGTGTCCCG TCCATTCCTT TAGGAGGAGC CGGATCCGGC 840 

GCCGCAGTCG CCCAAGGTCC CGACCGCCGC GGCCTCGGCC GCCGCCGCCA AGGAGCGGAA 900 
GGAGGTGCAG CGGCGGAAGC AGCGCGACGA GGAGGGCCTC CACCTGCTGA GTGCTGACGC 960

TGCTOCTGCA GTGCGOGGAG GCCGTGAACG CGGACAACCT OGACGACGCG CACCAGACGC 1020

TGCTGGAGAT OGCGGAGCTG GCCACGCCGT TOGGCACCTG GACCCAGCGC GTGGCGGCCT 1080

ACTTCGCGGA GGCCATGTCG GCGCGCGTCG TCAGCTCCTG CCTAGGCCTG TACGCGCCGC 1140

TGCCGCCGGG CTCCCCCGCC GCGGCGOGCC TCCAOGGCOG CGTGGCCGCC GCGTTCCAGG 1200

TGTTCAACGG CATCAGCCCC TTOGTCAAGT TCTCGCACTT CACCGCCAAC CAGGCCATCC 1260

AGGAGGGGTT CGAGCGGGAG GAGCGTGTGC ACATCATCGA CCTOGACATC ATGCAGGGGC 1320

TGCAGTGGCC CGGCCTCTTC CACATCCTTG TCTCCCGCCC CGGCGGCCCG CCCAGGGTCA 1380

GGCTCACCGG CCTGGGGGCG TCCATGGACG CGCTCGAGGC GAOGGGGAAG CGCCTCTCCG 1440

ACTTCGCCGA CACGCTCGGC CTGCCCTTCG AGTTCTGCGC CGTCGCOGAG AAGGCCGGCA 1500

ACGTTGACCC GCAGAAGCTG GGCGTCACGC GGCGGGAGGC GGTCGCOGTC CACTGGCCGC 1560

ACCACTCGCT TTACGACGTC ATCGGCTOOG ACTCCAACAC GCTCTGGCTC ATCCAAAGGT 1620

CCTCCATTTT CCTTCTCTGC CTTTCTTCCA TGTCAAATCT TGATGCAATC ATGACCACTT 1680

TTCAGCTGCT GACATTGGAT AATGTGAGCT TTACGGCAAG CATCAAGTCG TGGTAGTACA 1740

TCCATTACAG CTATTTCTAA AATATTCTTC GGAGGTTTCC TGCTCATAGT AAAAAAAAAT 1800

CGCGTTTTGA AGCTCAAAAG GCGATTTCTT CCGAGGTTTG CTGTTGAGCG CTATTTTGGA 1860

AACCCCATTT TCTCAATTGA TTTTTATTTT TTAAAGAAAA ATTAGTTCAT TTTTCTCTTG 1920

TGAAATGGAG TCCCAAACTA ACCCTAATAT TAAAAAAAAC GCGCTTTGGA GCTCAAAACG 1980

CTCGTTGTTA TGACCAACCA GCTTTATAGG TTTAAAAAGG TTGAATCTTG ACAATGCTTT 2040

TGAAAAGGTT GAATCTTGAC AATGCTTTTG AGATGATACT GTAGTGTAGT CTGTAGTGGA 2100

GCATCCTCCA TGGTCTTTGG TGATCGAGAA TTCCTGCAGC CCGGGGGATC C 2151


(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 716 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

Tyr Gln His His Gln Phe Xaa Met Xaa Val Gly Lys Arg Ser Xaa Gly
1 5 10 15

Phe Ser Xaa Xaa Phe Gly His Lys Val Xaa His Cys Leu Asn Xaa Ala
20 25 30

Ile Xaa Xaa Arg Asn Xaa Ser Ile Ile Tyr Cys Ser Tyr Xaa Lys Xaa
35 40 45

Lys Arg Tyr Ala Gln Glu Pro Ser Arg Glu Ala Lys Asn Xaa Lys Ile
50 55 60

Xaa Leu Tyr Xaa Lys Asn Met Ile Gly Cys Leu Phe Leu Met Lys Ile
65 70 75 80

Tyr Xaa Ile Ile Ile Ser Trp Leu Asp Val Ala His Glu Val Val Leu
85 90 95

Ala Ser Leu Asp Ser Cys Gly His Ser Ser Leu Leu Asp Ala Leu Thr
100 105 110

Xaa Xaa Glu Val Ser Asn Leu Val Leu His Asn Ser Gly Asp Ser Cys
115 120 125

Phe Leu His Cys Asn Xaa Tyr Xaa Cys Leu Ile His Ala Ser Val Thr
130 135 140

Leu Cys Val Xaa Leu Val Leu Tyr Val Ser Leu Asp Cys Arg Val Ser
145 150 155 160

Ala Thr Ser Val Ser Pro Cys Gly Phe Leu Val Ser Phe Val Leu Val
165 170 175

Ser Asn Phe Leu Leu Ile Tyr Xaa Ser Asp Asn Xaa Phe Xaa Lys Tyr
180 185 190

Ser Leu Val Pro Leu Met Leu Gln Leu Val Phe Ser Val Tyr Val Ser
195 200 205

Ile Ile Ile Phe Pro Tyr Tyr Glu Ser Gly Arg Glu Thr Arg Pro Thr
210 215 220

Arg Ala Ser Thr Ser Ser His Phe Thr Phe Ile Val Gly Ser Xaa Met
225 230 235 240

Ser Gly Pro Arg Xaa Thr Pro Phe Ala Cys Lys Thr Thr Ser Ser Thr
245 250 255

Arg Cys Xaa Val Ala Ser Arg Asn Ile Thr Met Cys Pro Val His Ser
260 265 270

Phe Arg Arg Ser Arg Ile Arg Arg Arg Ser Arg Pro Arg Ser Arg Pro
275 280 285

Pro Arg Pro Arg Pro Pro Pro Pro Arg Ser Gly Arg Arg Cys Ser Gly
290 295 300

Gly Ser Ser Ala Thr Arg Arg Ala Ser Thr Cys Xaa Val Leu Thr Leu
305 310 315 320

Leu Leu Gln Cys Ala Glu Ala Val Asn Ala Asp Asn Leu Asp Asp Ala
325 330 335

His Gln Thr Leu Leu Glu Ile Ala Glu Leu Ala Thr Pro Phe Gly Thr
340 345 350

Ser Thr Gln Arg Val Ala Ala Tyr Phe Ala Glu Ala Met Ser Ala Arg
355 360 365

Val Val Ser Ser Cys Leu Gly Leu Tyr Ala Pro Leu Pro Pro Gly Ser
370 375 380

Pro Ala Ala Ala Arg Leu His Gly Arg Val Ala Ala Ala Phe Gln Val
385 390 395 400

Phe Asn Gly Ile Ser Pro Phe Val Lys Phe Ser His Phe Thr Ala Asn
405 410 415

Gln Ala Ile Gln Glu Ala Phe Glu Arg Glu Glu Arg Val His Ile Ile
420 425 430

Asp Leu Asp Ile Met Gln Gly Leu Gln Trp Pro Gly Leu Phe His Ile
435 440 445

Leu Val Ser Arg Pro Gly Gly Pro Pro Arg Val Arg Leu Thr Gly Leu
450 455 460

Gly Ala Ser Met Asp Ala Leu Glu Ala Thr Gly Lys Arg Leu Ser Asp
465 470 475 480

Phe Ala Asp Thr Leu Gly Leu Pro Phe Glu Phe Cys Ala Val Ala Glu
485 490 495

Lys Ala Gly Asn Val Asp Pro Gln Lys Leu Gly Val Thr Arg Arg Glu
500 505 510

Ala Val Ala Val His Trp Pro His His Ser Leu Tyr Asp Val Ile Gly
515 520 525

Ser Asp Ser Asn Thr Leu Trp Leu Ile Gln Arg Ser Ser Ile Phe Leu
530 535 540

Leu Cys Leu Ser Ser Met Ser Asn Leu Asp Ala Ile Met Thr Thr Phe
545 550 555 560

Gln Leu Leu Thr Leu Asp Asn Val Ser Phe Thr Ala Ser Ile Lys Ser
565 570 575

Trp Xaa Tyr Ile His Tyr Ser Tyr Phe Xaa Asn Ile Leu Arg Arg Phe
580 585 590

Pro Ala His Ser Lys Lys Lys Ser Arg Phe Glu Ala Gln Lys Ala Ile
595 600 605

Ser Ser Glu Val Cys Cys Xaa Ala Leu Phe Trp Lys Pro His Phe Leu
610 615 620

Asn Xaa Phe Leu Phe Phe Lys Glu Lys Leu Val His Phe Ser Leu Val
625 630 635 640

Lys Trp Ser Pro Lys Leu Thr Leu Ile Leu Lys Lys Thr Arg Phe Gly
645 650 655

Ala Gln Asn Ala Arg Cys Tyr Asp Gln Pro Ala Leu Xaa Val Xaa Lys
660 665 670

Gly Xaa Ile Leu Thr Met Leu Leu Lys Arg Leu Asn Leu Asp Asn Ala
675 680 685

Phe Glu Met Ile Leu Xaa Cys Ser Leu Xaa Trp Ser Ile Leu His Gly
690 695 700

Leu Trp Xaa Ser Arg Ile Pro Ala Ala Arg Gly Ile
705 710 715
WHAT IS CLAIMED IS:

1. An isolated nucleic acid molecule comprising a
nucleotide sequence that encodes a SCARECROW protein
containing an amino acid sequence substantially similar to
the sequence of MOTIF III (VHIID) of Arabidopsis SCR protein
shown in FIGS. 13A-F.

2. An isolated nucleic acid molecule comprising a
nucleotide sequence that (a) encodes a scarecrow protein
having the amino acid sequence of any one of SEQ ID
NO:2, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:34,
SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:41, SEQ
ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:46, SEQ ID
NO:48, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID
NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:59, SEQ ID
NO:61, SEQ ID NO:63, SEQ ID NO:65 or SEQ ID NO:67; or (b) is
the complement of the nucleotide sequence of (a).

3. An isolated nucleic acid molecule comprising a
nucleotide sequence that hybridizes to the nucleic acid of
Claim 2 and encodes a naturally occurring SCR gene product.

4. A nucleic acid molecule comprising a nucleotide sequence
that (a) encodes a SCR protein lacking one to four of the
following motifs delineated in FIGS. 13A-F: MOTIF I, MOTIF
II, MOTIF III, MOTIF IV, MOTIF V, or MOTIF VI; or (b) is the
complement of the nucleotide sequence of (a).

5. A nucleic acid molecule comprising a nucleotide sequence
that (a) encodes a polypeptide corresponding to MOTIF I,
MOTIF II, MOTIF IV, MOTIF V or MOTIF VI of the SCARECROW
protein delineated in FIGS. 13A-F; or (b) is the complement
of the nucleotide sequence of (a).

6. The isolated nucleic acid molecule of Claim 1 comprising
the nucleic acid sequence of any one of SEQ ID NO:1, SEQ ID
NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:45, SEQ ID
NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID
NO:55, SEQ ID NO:57, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64
or SEQ ID NO:66.

7. A DNA vector containing the nucleotide sequence of Claim
1, 2, 3, 4, 5, or 6.

8. An expression vector containing the nucleotide sequence
of Claim 1, 2, 3, 4, 5, or 6 operatively associated with a
regulatory nucleotide sequence containing transcriptional and
translational regulatory information that controls expression
of the nucleotide sequence in a host cell.

9. A genetically engineered host cell containing the
nucleotide sequence of Claim 1, 2, 3, 4, 5, or 6.

10. A genetically engineered host cell containing the
nucleotide sequence of Claim 1, 2, 3, 4, 5, or 6 operatively
associated with a regulatory nucleotide sequence containing
transcriptional and translational regulatory information that
controls expression of the nucleotide sequence in a host
cell.

11. An isolated SCARECROW protein.

12. The protein of Claim 11 having the amino acid sequence
shown in FIG. 5E (SEQ ID NO:2).

13. A SCARECROW protein lacking one to four of the following
motifs delineated in FIGS. 13A-F: MOTIF I, MOTIF II, MOTIF
III, MOTIF IV, MOTIF V, or MOTIF VI.

14. A polypeptide corresponding to MOTIF I, MOTIF II, MOTIF
IV, MOTIF V or MOTIF VI of the SCARECROW protein as
delineated in FIGS. 13A-F.

15. An antibody that immunospecifically binds the protein or
polypeptide of Claim 11, 12, 13 or 14.

16. An anti-idiotypic antibody that mimics an epitope of the
SCARECROW protein.

17. A plant engineered to overexpress or underexpress the
SCARECROW protein, so that cell division is modified and root
development is altered.

18. A plant engineered to overexpress the SCARECROW protein,
so that cell division is increased in roots, resulting in
thicker root development.

19. A transgenic plant containing a transgene having the
nucleotide sequence of Claim 1, 2, 3, 4, 5, or 6.

20. A transgenic plant containing a transgene having the
nucleotide sequence of Claim 1, 2, 3, 4, 5, or 6 operatively
associated with a regulatory nucleotide sequence containing
transcriptional and translational regulatory information that
controls expression of the nucleotide sequence in a
transgenic plant cell.

21. The transgenic plant of Claim 19, in which the transgene
encodes an antisense molecule that suppresses expression of
endogenous SCARECROW gene product, so that cell division is
decreased in roots, resulting in thinner root development.

22. A genetically engineered plant in which the endogenous 
SCARECROW gene is disrupted or inactivated so that cell 
division is decreased in roots, resulting in thinner root 
development. 



23. A transgenic plant containing a transgene encoding a 
gene of interest operatively associated with a SCARECROW 
promoter, so that the gene of interest is expressed in roots. 
24. The transgenic plant of Claim 23, in which the gene of
interest encodes a gene product that confers herbicide, salt,
pathogen, or insect resistance.

25. A transgenic plant containing a transgene encoding a
gene of interest operatively associated with a SCARECROW
promoter, so that the gene of interest is expressed in stems.

26. The transgenic plant of Claim 25, in which the gene of
interest encodes a gene product that increases starch, lignin
or cellulose biosynthesis.

27. A plant engineered to overexpress or underexpress the
SCARECROW protein so that the stem or hypocotyl gravitropism
is altered.

28. The plant of Claim 27, which is less susceptible to
lodging than a wild-type plant.
1234567890 1234567890 1234567890

GGCACGAGCC CAACGGGTCC TGAGCTTCTT ACTTATATGC ATATCTTGTA 50 
GTSP TGP ELL TYMH ILY 

TGAAGCCTGC CCTTATTTCA AATTCGGTTA TGAATCTGCT AATGGAGCTA 100 
EAC PYFK FGY ESA NGAI 

TAGCTGAAGC TGTGAAGAAC GAAAGTTTTG TGCACATTAT CGATTTCCAG 150 
AEA VKN ESFV H I I DFQ

ATTTCTCAAG GTGGTCAATG GGTGAGTTTG ATCCGTGCTC TTGGTGCTAG 200 
ISQG GQW VSL IRAL GAR

ACCTGGTGGA CCTCCGAACG TTAGGATAAC GGGAATTGAT GATCCGAGAT 250 
PGG PPNV RIT GID DPRS

CATCGTTTGC TCGTCAAGGA GGACTTGAGT TAGTTGGACA AAGACTTGGG 300 
SFA RQG GLEL V G Q RLG 

AAGCTAGCTG AAATGTGCGG TGTTCCGTTT GAGTTCCATG GAGCTGCTTT 350 
KLAE MCG VPF EFHG AAL 

ATGCTGCACG GAAGTCGAAA TCGAGAAGCT AGGAGTTAGA AATGGAGAAG 400 
CCT EVEI EKL GVR NGEA 

CGCTCGCGGT TAACTTCCCG CTTGTTCTTC ACCACATGCC TGATGAGAGT 450 
LAV NFP LVLH HMP DES 

GTAACTGTGG AGAATCACAG AGATAGATTG TTGAGATTGG TCAAACACTT 500 
VTVE NHR DRL LRLV KHL 

GTCACCAAAC GTTGTGACTC TGGTTGAGCA AGAAGCGAAT ACAAACACTG 550 
S P N V V T L V E Q E A N T N T A

CGCCGTTTCT TCCCCGGTTT GTCGAGACAA TGAACCATTA CTTGGCAGTT 600
PFL PRF VETM NHY LAV 
1234567890 1234567890 1234567890 1234567890 1234567890
TTCGAATCAA TAGATGTGAA ACTCGCTAGA GATCACAAGG AAAGGATCAA 650 
FESI DVK LAR DHKE RIN 

TGTTGAGCAG CATTGTTTGG CTAGAGAGGT TGTGAATCTT ATAGCTTGTG 700 
VEQ HCLA REV VNL IACE

AAGGTGTTGA AAGAGAAGAG AGGCACGAGC CACTAGGGAA ATGGAGGTCT 750 
GVE REE RHEP LGK WRS 

CGGTTTCACA TGGCGGGATT TAAACCGTAT CCTTTGAGCT CGTATGTGAA 800 
RFHM AGF KPY PLSS YVN 

CGCAACAATC AAAGGATTGC TTGAGAGTTA TTCAGAGAAG TATACACTTG 850 
ATI KGLL ESY SEK YTLE 

AAGAAAGAGA TGGAGCATTG TATTTAGGAT GGAAGAATCA ACCTCTTATC 900 
ERD GAL YLGW KNQ PLI 

ACTTCTTGTG CTTGGAGGTA ACTAATAAAA ACCTTGTTCG GTTTCAGAAG 950 
T S C A W R X 

AGATTAGAAA CTTCTTTTAA AGTTTGCAGA ATCTGTTTGT AAAAGTAAAA 1000 

CTCATGCATG ATCCGNAGGA ACAAGTTGTC AAATGTTGTA GTAGTAAGTG 1050 

ATATGTTGAT GACCCMAAA AAAAAAAAAA AAAAA 1085 
GCTATGGAAG GAGAGAAGAT GGTTCATGTG ATTGATCTCG ATGCTTCTGA 50
AMEG EKM VHV IDLD A S E

GCCAGCTCAA TGGCTTGCTT TGCTTCAAGC TTTTAACTCT AGGCCTGAAG 100 
PAQ WLAL LQA FNS RPEG

GTCCACCTCA TTTGAGAATC ACTGGTGTTC ATCACCAGAA GGAAGTGCTT 150 
PPH LRI TGVH HQK EVL 

GAACAAATGG CTCATAGACT CATTGAGGAA GCAGAGAAAC TCGATATCCC 200 
EQMA HRL IEE AEKL DIP

GTTTCAGTTT AATCCCGTTG TGAGTAGGTT AGACTGTTTA AATGTAGAAC 250 
FQF NPVV SRL DCL N V E Q 

AGTTGCGGGT TAAAACAGGA GAGGCCTTAG CCGTTAGCTC GGTTCTTCAA 300 
LRV KTG EALA VSS V L Q 

TTGCATACCT TCTTGGCCTC TGATGATGAT CTCATGAGAA AGAACTGCGC 350 
LHTF LAS DDD LMRK NCA

TTTACGGTTT CAGAACAACC CTAGTGGAGT TGACTTGCAG AGAGTTCTAA 400 
LRF QNNP SGV DLQ RVLM

TGATGAGCCA TGGCTCTGCA GCTGAGGCAC GTGAGAATGA TATGAGTAAC 450 
MSH GSA AEAR END MSN 

AACAATGGGT ATAGCCCTAG CGGTGAGTCG GCCTCATCTT TGCCTTTACC 500 
NNGY SPS GDS ASSL PLP 

AAGTTCAGGA AGGACTGATA GCTTCCTCAA TGCTATTTGG GGTTTGTCTC 550 
SSG RTDS FLN AIW GLSP 

CAAAGGTCAT GGTGGTCACT GAGCAAGACT CAGACCACAA CGGCTCCACA 600 
KVM VVT EQDS DHN GST 
CTAATGGAGA GGCTATTAGA ATCACTTTAC ACCTACGCAG CATTGTTTGA 650
LMER LLE SLY TYAA LFD

TTGCTTGGAA ACAAAAGTTC CAAGAACGTC TCAAGATAGG ATCAAAGTGG 700 
CLE TKVP RTS QDR IKVE

AGAAGATGCT CTTCGGGGAG GAGATCAAGA ACATCATATC CTGCGAGGGA 750 
KML FGE EIKN IIS CEG 

TTTGAGAGAA GAGAAAGACA CGAGAAGCTT GAGAAATGGA GCCAGAGGAT 800 
FERR ERH EKL EKWS QRI 

DGATTTGGCT GGTTTTGGGA ATGTTCCTCT TAGCTATTAT GCGATGTTGC 850 
DLA GFGN VPL SYY AMLQ 

AGGCTAGGAG ATTGCTTCAA GGGTGCGGTT TTGATGGGTA TAGAATCAAG 900 
ARR LLQ GCGF DGY R I K

GAAGAGAGCG GGTGOGCAGT AATTTGCTGG CAAGATCGAC CTCTATACTC 950 
EESG CAV ICW QDRP LYS

GGTATCAGCT TGGAGATGCA GGAAGTGAAT GATATATTAC AGTTTGTCTT 1000 
VSA WRCR KX 

CTATTTTGGT TATGAGCAGA GTCCCTTTCT TTTTTGTATA CATGGGGACA 1050 

CAATCTTAGT TGTTTTGTGA TGGTGACTTT CTGTCTCTTT ATGCTATTTT 1100 

GGCTTAAATG CTTCTACTGC CTCTGCATGT AAAGCCTTTG TGTGTTGGTT 1150 

CAATTTGGTC TGGTGTGGGT GTAATACCAA ACCAAATCCA ATTTGAGCTG 1200 

AAGATAACTA ATTTGATGAT CGGCTCGTGC C 1231 
CCAGGAGGCGTTCGAGCGGGAGGAGCGTGTGCACATCATCGACCTCGACA
QEAFEREERVHIIDLDI

60 70 80 90 100

TCATGCAGGGGCTGCAGTGGCCGGGCCTCTTCCACATCCTTGCCTCCCGC
MQGLQWPGLFHILASR
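
The one-letter translation printed under this fragment contains the VHIID residues of MOTIF III. As an illustration only (not part of the original disclosure), the sketch below translates a nucleotide fragment in each of the three forward reading frames and reports where a motif string occurs. The names `translate_frame` and `find_motif` are hypothetical, and the fragment is typed with the ambiguous OCR character at position 26 read as C (CGT, Arg), an assumption chosen to agree with the printed translation.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_frame(dna, frame):
    """Return the one-letter translation of `dna` starting at offset `frame`."""
    seq = "".join(dna.split()).upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")  # unknown codons become X
        for i in range(frame, len(seq) - 2, 3)
    )


def find_motif(dna, motif="VHIID"):
    """List (frame, residue_index) pairs where `motif` occurs in frames 0-2."""
    hits = []
    for frame in range(3):
        protein = translate_frame(dna, frame)
        start = protein.find(motif)
        while start != -1:
            hits.append((frame, start))
            start = protein.find(motif, start + 1)
    return hits


# First 50 nucleotides of the fragment above (position 26 assumed to be C).
FRAGMENT = "CCAGGAGGCGTTCGAGCGGGAGGAGCGTGTGCACATCATCGACCTCGACA"

if __name__ == "__main__":
    print(translate_frame(FRAGMENT, 1))
    print(find_motif(FRAGMENT))
```

Scanning all three frames is a design convenience here: the figure prints the translation offset by one nucleotide from the start of the line, so the frame is found by the search rather than assumed in advance.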
CACGCGTCCG TCAAAGGATA CAACCATGTA CACATAATTG ACTTTTCCCT 50
HASV KGY NHV HIID FSL 

GATGCAAGGT CTCCAGTGGC CGGCACTCAT GGATGTCTTC TCCGCCCGTG 100 
MQG LQWP ALM DVF SARE

AGGGTGGGCC ACCAAAGCTC CGAATCACAG GCATTGGCCC GAACCCAATA 150 
GGP PKL RITG IGP NPI 

GGTGGCCGTG ACGAGCTCCA TGAAGTGGGA ATTCGCCTCG CCAAGTATGC 200 
GGRD ELH EVG IRLA KYA 

ACACTCGGTG GGTATCGACT TCACTTTCCA GGGAGTCTGT GTCGATCAAC 250 
HSV GIDF TFQ GVC VDQL

TTGATAGGTT GTGCGACTGG ATGCTTCTCA AACCAATCAA AGGAGAGGCA 300 
DRL CDW MLLK PIK GEA 

GTTGCCATAA ACTCCATCCT ACAACTCCAT CGCCTCCTCG TTGACCCAGA 350 
VAIN SIL QLH RLLV DPD

TGCAAACCCA GTGGTGCCCG CACCAATAGA TATCCTCCTC AAATTGGTCA 400 
ANP VVPA PID ILL KLVI

TCAAGATAAA CCCCATGATC TTCACGGTGG TTGAGCATGA GGCAGATCAC 450 
KIN PMI FTVV EHE ADH 

AACAGACCAC CACTACTAGA GAGGTTCACT AATGCCCTCT TCCACTATGC 500 
NRPP LLE RFT NALF HYA 

GACCATGTTT GACTCTTTGG AGGCCATGCA TCGTTGTACC AGTGGTAGAG 550 
TMF DSLE AMH RCT SGRD 

ACATCACCGA CTCACTCACA GAGGTGTACC TTCGAGGTGA GATTTTTGAC 600 
ITD SLT EVYL RGE IFD
1234567890 1234567890 1234567890 1234567890 1234567890
ATTGTCTGCG GCGAGGGCAG TGCACGCACC GAACGTCATG AGTTGTTTGG 650 
IVCG EGS ART ERHE LFG 

TCACTGGAGG GAGAGGCTCA CCTATGCTGG GCTAACTCAA GTGTGGTTCG 700 
HWR ERLT YAG LTQ VWFD 

ACCCCGATGA GGTTGACACG CTAAAAGACC AGTTGATCCA TGTGACATCC 750 
PDE VDT LKDQ L I H VTS

TTATCTGGCT CTGGGTTCAA CATCCTAGTG TGTGATGGCA GCCTTGCACT 800 
LSGS GFN I L V CDGS LAL

AGCGTGGCAT AATCGCCCGT TATATGTGGC AACAGCTTGG TGTGTGACAG 850 
AWH NRPL Y V A TAW CVTG 

GAGGAAATGC TGCCAGTTCC ATGGTTGGCA ACATCTGTAA GGGTACAAAT 900 
GNA ASS MVGN ICK GTN 

GATAGTAGAA GAAAGGAAAA CCGTAATGGA CCCATGGAGT AGCAGGAAGA 950 
DSRR KEN RNG PMEX

ATAACCATGT CATGAGCAAA TCGATCAAGT AATAAAATGC ACTGATGACA 1000 

TGCATGGTGA TCTAAAGTTT TTTTGCGTGA ATGTGCAATG ACGAATTGTT 1050 

CAATTTGAAT AACCTAATCA TGAGACTCAA AAAAAAAAAA AAA 1093 



FIG.11B2 



SUBSTITUTE SHEET (RULE 26) 



WO 97/41152 



PCT/US97/07022 



49/74 




DC •05 

w < 3 

www 

ass 

O Q Q 



>* m a a z 





o o 

M W 
O O 



w w 
o u 

to < 



o 
w 



o o o w 

W Q W W 

quo 
> 



u 




cxowQ 



o o o 
www 

DO 
HKQ2 



2 

o 

X 

> 



o 
in 

SO 



*4 
►4 



« PC « X X O 
ttOOOOIX 
>4 »-) 1-1 »-) 
fa 



HUH 
•J fa 

w i-i S3 



o w 





a 

----- gpgeg 

H PS PS Pi P5 W 

- - _ m o« w a w 2 

wooqqHwo 

cupuhcuiccuouc/) 
u • • • • • 

• • • • • 

_J > _i < 

Q Q W Q W 
H H 5 J J 

w w w w w 
w w w w o 

W t-1 w w 





lO 



o 

V0 



H££r?*2£ in '- ,f *>'«»'*0»-«* , oc\ia>o»o«<r 

on CO CO H H 



SUBSTITUTE SHEET (RULE 26) 



WO 97/41152 



PCT/US97/07022 



50/74 






























FIG.19E



FIG.19F 




FIG.19G 







FIG.20A



FIG.20B 






SCR Promoter::GUS     SCR Promoter::SCR




FIG.21A 




FIG.21C 




FIG.21E 




FIG.21B




FIG.21D




FIG.21F 






INTERNATIONAL SEARCH REPORT 



International application No.
PCT/US97/07022



A. CLASSIFICATION OF SUBJECT MATTER 

IPC(6): C07K 14/415; C12N 1/21, 5/10, 15/29, 15/63; A01H 5/00 
US CL: Please See Extra Sheet. 
According to International Patent Classification (IPC) or to both national classification and IPC 

B. FIELDS SEARCHED 

Minimum documentation searched (classification system followed by classification symbols) 

U.S.: 536/23.6, 23.1; 435/320.1, 252.3, 419; 530/350, 370, 387.9; 800/205 



Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched 



Electronic data base consulted during the international search (name of data base and, where practicable, search terms used) 
APS, DIALOG * Biotech Files, GenEMBL sequence databases 



C. DOCUMENTS CONSIDERED TO BE RELEVANT 

Category*    Citation of document, with indication, where appropriate, of the relevant passages    Relevant to claim No. 

Y, P 

SCHERES et al. Mutations affecting the radial organisation 
of the Arabidopsis root display specific defects throughout 
the embryonic axis. Development. 1995, Vol. 121, pages 
53-62, see entire document. Relevant to claims 1-28. 

WYSOCKA-DILLER et al. Root radial organization. Plant 
Physiology. June 1996, Vol. 111, No. 2, abstract no. 
40001, page 12, see entire abstract. Relevant to claims 1-28. 



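[ ] Further documents are listed in the continuation of Box C. [ ] See patent family annex. 

* Special categories of cited documents: 

"A" document defining the general state of the art which is not considered 
to be of particular relevance 

"E" earlier document published on or after the international filing date 

"L" document which may throw doubts on priority claim(s) or which 
is cited to establish the publication date of another citation or other 
special reason (as specified) 

"P" document published prior to the international filing date but later than 
the priority date claimed 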




Date of the actual completion of the international search 
15 AUGUST 1997 



Name and mailing address of the ISA/US 
Commissioner of Patents and Trademarks 
Box PCT 

Washington, D.C. 20231 
Facsimile No. (703) 3050230 



Form PCT/ISA/210 (second sheet) (July 1992)* 



Date of mailing of the international search report 

03 SEP 1997 



Authorized officer 

ELIZABETH C. KEMMERER 
Telephone No. (703) 308-0196 



INTERNATIONAL SEARCH REPORT 



International application No. 



PCT/US97/07022 



A. CLASSIFICATION OF SUBJECT MATTER: 
US CL: 

536/23.6, 23.1; 435/320.1, 252.3, 419; 530/350, 370, 387.9; 800/205 



Form PCT/ISA/210 (extra sheet) (July 1992)* 



